Applying item response theory and computer adaptive testing: The challenges for health outcomes assessment

Research output: Contribution to journal (Article)

54 Citations (Scopus)

Abstract

Objectives: We review the papers presented at the NCI/DIA conference, to identify areas of controversy and uncertainty, and to highlight those aspects of item response theory (IRT) and computer adaptive testing (CAT) that require theoretical or empirical research in order to justify their application to patient reported outcomes (PROs).

Background: IRT and CAT offer exciting potential for the development of a new generation of PRO instruments. However, most of the research into these techniques has been in non-healthcare settings, notably in education. Educational tests are very different from PRO instruments, and consequently problematic issues arise when adapting IRT and CAT to healthcare research.

Results: Clinical scales differ appreciably from educational tests, and symptoms have characteristics distinctly different from examination questions. This affects the transferring of IRT technology. Particular areas of concern when applying IRT to PROs include inadequate software, difficulties in selecting models and communicating results, insufficient testing of local independence and other assumptions, and a need for guidelines for estimating sample size requirements. Similar concerns apply to differential item functioning (DIF), which is an important application of IRT. Multidimensional IRT is likely to be advantageous only for closely related PRO dimensions.

Conclusions: Although IRT and CAT provide appreciable potential benefits, there is a need for circumspection. Not all PRO scales are necessarily appropriate targets for this methodology. Traditional psychometric methods, and especially qualitative methods, continue to have an important role alongside IRT. Research should be funded to address the specific concerns that have been identified.

Original language: English
Pages (from-to): 187-194
Number of pages: 8
Journal: Quality of Life Research
Volume: 16
Issue number: Supplement 1
Early online date: 7 Apr 2007
DOI: 10.1007/S11136-007-9197-1
Publication status: Published - Aug 2007

Keywords

  • quality of life
  • item response theory
  • patient reported outcomes
  • health outcomes measurement
  • quality-of-life
  • measurement precision
  • regression-models
  • indicators
  • trials
  • scales

Cite this

Applying item response theory and computer adaptive testing: The challenges for health outcomes assessment. / Fayers, Peter.

In: Quality of Life Research, Vol. 16, No. Supplement 1, 08.2007, p. 187-194.

@article{82ec981c34f642b0a4179cf245167bf7,
title = "Applying item response theory and computer adaptive testing: The challenges for health outcomes assessment",
abstract = "Objectives: We review the papers presented at the NCI/DIA conference, to identify areas of controversy and uncertainty, and to highlight those aspects of item response theory (IRT) and computer adaptive testing (CAT) that require theoretical or empirical research in order to justify their application to patient reported outcomes (PROs). Background: IRT and CAT offer exciting potential for the development of a new generation of PRO instruments. However, most of the research into these techniques has been in non-healthcare settings, notably in education. Educational tests are very different from PRO instruments, and consequently problematic issues arise when adapting IRT and CAT to healthcare research. Results: Clinical scales differ appreciably from educational tests, and symptoms have characteristics distinctly different from examination questions. This affects the transferring of IRT technology. Particular areas of concern when applying IRT to PROs include inadequate software, difficulties in selecting models and communicating results, insufficient testing of local independence and other assumptions, and a need for guidelines for estimating sample size requirements. Similar concerns apply to differential item functioning (DIF), which is an important application of IRT. Multidimensional IRT is likely to be advantageous only for closely related PRO dimensions. Conclusions: Although IRT and CAT provide appreciable potential benefits, there is a need for circumspection. Not all PRO scales are necessarily appropriate targets for this methodology. Traditional psychometric methods, and especially qualitative methods, continue to have an important role alongside IRT. Research should be funded to address the specific concerns that have been identified.",
keywords = "quality of life, item response theory, patient reported outcomes, health outcomes measurement, quality-of-life, measurement precision, regression-models, indicators, trials, scales",
author = "Peter Fayers",
year = "2007",
month = "8",
doi = "10.1007/S11136-007-9197-1",
language = "English",
volume = "16",
pages = "187--194",
journal = "Quality of Life Research",
issn = "0962-9343",
publisher = "Springer",
number = "Supplement 1",

}

TY - JOUR

T1 - Applying item response theory and computer adaptive testing

T2 - The challenges for health outcomes assessment

AU - Fayers, Peter

PY - 2007/8

Y1 - 2007/8

AB - Objectives: We review the papers presented at the NCI/DIA conference, to identify areas of controversy and uncertainty, and to highlight those aspects of item response theory (IRT) and computer adaptive testing (CAT) that require theoretical or empirical research in order to justify their application to patient reported outcomes (PROs). Background: IRT and CAT offer exciting potential for the development of a new generation of PRO instruments. However, most of the research into these techniques has been in non-healthcare settings, notably in education. Educational tests are very different from PRO instruments, and consequently problematic issues arise when adapting IRT and CAT to healthcare research. Results: Clinical scales differ appreciably from educational tests, and symptoms have characteristics distinctly different from examination questions. This affects the transferring of IRT technology. Particular areas of concern when applying IRT to PROs include inadequate software, difficulties in selecting models and communicating results, insufficient testing of local independence and other assumptions, and a need for guidelines for estimating sample size requirements. Similar concerns apply to differential item functioning (DIF), which is an important application of IRT. Multidimensional IRT is likely to be advantageous only for closely related PRO dimensions. Conclusions: Although IRT and CAT provide appreciable potential benefits, there is a need for circumspection. Not all PRO scales are necessarily appropriate targets for this methodology. Traditional psychometric methods, and especially qualitative methods, continue to have an important role alongside IRT. Research should be funded to address the specific concerns that have been identified.

KW - quality of life

KW - item response theory

KW - patient reported outcomes

KW - health outcomes measurement

KW - quality-of-life

KW - measurement precision

KW - regression-models

KW - indicators

KW - trials

KW - scales

U2 - 10.1007/S11136-007-9197-1

DO - 10.1007/S11136-007-9197-1

M3 - Article

VL - 16

SP - 187

EP - 194

JO - Quality of Life Research

JF - Quality of Life Research

SN - 0962-9343

IS - Supplement 1

ER -