Applying item response theory and computer adaptive testing: The challenges for health outcomes assessment

Peter Fayers

doi:10.1007/S11136-007-9197-1

Applied Health Sciences

Norwegian University of Science and Technology

Research output: Contribution to journal › Article › peer-review

62 Citations (Scopus)

Abstract

Objectives: We review the papers presented at the NCI/DIA conference to identify areas of controversy and uncertainty, and to highlight those aspects of item response theory (IRT) and computer adaptive testing (CAT) that require theoretical or empirical research in order to justify their application to patient reported outcomes (PROs).

Background: IRT and CAT offer exciting potential for the development of a new generation of PRO instruments. However, most of the research into these techniques has been in non-healthcare settings, notably in education. Educational tests are very different from PRO instruments, and consequently problematic issues arise when adapting IRT and CAT to healthcare research.

Results: Clinical scales differ appreciably from educational tests, and symptoms have characteristics distinctly different from examination questions. This affects the transfer of IRT technology. Particular areas of concern when applying IRT to PROs include inadequate software, difficulties in selecting models and communicating results, insufficient testing of local independence and other assumptions, and a need for guidelines on estimating sample size requirements. Similar concerns apply to differential item functioning (DIF), which is an important application of IRT. Multidimensional IRT is likely to be advantageous only for closely related PRO dimensions.

Conclusions: Although IRT and CAT provide appreciable potential benefits, there is a need for circumspection. Not all PRO scales are necessarily appropriate targets for this methodology. Traditional psychometric methods, and especially qualitative methods, continue to have an important role alongside IRT. Research should be funded to address the specific concerns that have been identified.

Original language	English
Pages (from-to)	187-194
Number of pages	8
Journal	Quality of Life Research
Volume	16
Issue number	Supplement 1
Early online date	7 Apr 2007
DOIs	https://doi.org/10.1007/S11136-007-9197-1
Publication status	Published - Aug 2007

Keywords

quality of life
item response theory
patient reported outcomes
health outcomes measurement
quality-of-life
measurement precision
regression-models
indicators
trials
scales

Access to Document

10.1007/S11136-007-9197-1

Licence: Unspecified

Cite this

@article{82ec981c34f642b0a4179cf245167bf7,

title = "Applying item response theory and computer adaptive testing: The challenges for health outcomes assessment",

abstract = "Objectives: We review the papers presented at the NCI/DIA conference to identify areas of controversy and uncertainty, and to highlight those aspects of item response theory (IRT) and computer adaptive testing (CAT) that require theoretical or empirical research in order to justify their application to patient reported outcomes (PROs). Background: IRT and CAT offer exciting potential for the development of a new generation of PRO instruments. However, most of the research into these techniques has been in non-healthcare settings, notably in education. Educational tests are very different from PRO instruments, and consequently problematic issues arise when adapting IRT and CAT to healthcare research. Results: Clinical scales differ appreciably from educational tests, and symptoms have characteristics distinctly different from examination questions. This affects the transfer of IRT technology. Particular areas of concern when applying IRT to PROs include inadequate software, difficulties in selecting models and communicating results, insufficient testing of local independence and other assumptions, and a need for guidelines on estimating sample size requirements. Similar concerns apply to differential item functioning (DIF), which is an important application of IRT. Multidimensional IRT is likely to be advantageous only for closely related PRO dimensions. Conclusions: Although IRT and CAT provide appreciable potential benefits, there is a need for circumspection. Not all PRO scales are necessarily appropriate targets for this methodology. Traditional psychometric methods, and especially qualitative methods, continue to have an important role alongside IRT. Research should be funded to address the specific concerns that have been identified.",

keywords = "quality of life, item response theory, patient reported outcomes, health outcomes measurement, quality-of-life, measurement precision, regression-models, indicators, trials, scales",

author = "Peter Fayers",

year = "2007",

month = aug,

doi = "10.1007/S11136-007-9197-1",

language = "English",

volume = "16",

pages = "187--194",

journal = "Quality of Life Research",

issn = "0962-9343",

publisher = "Springer",

number = "Supplement 1",

}

TY - JOUR

T1 - Applying item response theory and computer adaptive testing

T2 - The challenges for health outcomes assessment

AU - Fayers, Peter

PY - 2007/8

Y1 - 2007/8

N2 - Objectives: We review the papers presented at the NCI/DIA conference to identify areas of controversy and uncertainty, and to highlight those aspects of item response theory (IRT) and computer adaptive testing (CAT) that require theoretical or empirical research in order to justify their application to patient reported outcomes (PROs). Background: IRT and CAT offer exciting potential for the development of a new generation of PRO instruments. However, most of the research into these techniques has been in non-healthcare settings, notably in education. Educational tests are very different from PRO instruments, and consequently problematic issues arise when adapting IRT and CAT to healthcare research. Results: Clinical scales differ appreciably from educational tests, and symptoms have characteristics distinctly different from examination questions. This affects the transfer of IRT technology. Particular areas of concern when applying IRT to PROs include inadequate software, difficulties in selecting models and communicating results, insufficient testing of local independence and other assumptions, and a need for guidelines on estimating sample size requirements. Similar concerns apply to differential item functioning (DIF), which is an important application of IRT. Multidimensional IRT is likely to be advantageous only for closely related PRO dimensions. Conclusions: Although IRT and CAT provide appreciable potential benefits, there is a need for circumspection. Not all PRO scales are necessarily appropriate targets for this methodology. Traditional psychometric methods, and especially qualitative methods, continue to have an important role alongside IRT. Research should be funded to address the specific concerns that have been identified.

AB - Objectives: We review the papers presented at the NCI/DIA conference to identify areas of controversy and uncertainty, and to highlight those aspects of item response theory (IRT) and computer adaptive testing (CAT) that require theoretical or empirical research in order to justify their application to patient reported outcomes (PROs). Background: IRT and CAT offer exciting potential for the development of a new generation of PRO instruments. However, most of the research into these techniques has been in non-healthcare settings, notably in education. Educational tests are very different from PRO instruments, and consequently problematic issues arise when adapting IRT and CAT to healthcare research. Results: Clinical scales differ appreciably from educational tests, and symptoms have characteristics distinctly different from examination questions. This affects the transfer of IRT technology. Particular areas of concern when applying IRT to PROs include inadequate software, difficulties in selecting models and communicating results, insufficient testing of local independence and other assumptions, and a need for guidelines on estimating sample size requirements. Similar concerns apply to differential item functioning (DIF), which is an important application of IRT. Multidimensional IRT is likely to be advantageous only for closely related PRO dimensions. Conclusions: Although IRT and CAT provide appreciable potential benefits, there is a need for circumspection. Not all PRO scales are necessarily appropriate targets for this methodology. Traditional psychometric methods, and especially qualitative methods, continue to have an important role alongside IRT. Research should be funded to address the specific concerns that have been identified.

KW - quality of life

KW - item response theory

KW - patient reported outcomes

KW - health outcomes measurement

KW - quality-of-life

KW - measurement precision

KW - regression-models

KW - indicators

KW - trials

KW - scales

U2 - 10.1007/S11136-007-9197-1

DO - 10.1007/S11136-007-9197-1

M3 - Article

SN - 0962-9343

VL - 16

SP - 187

EP - 194

JO - Quality of Life Research

JF - Quality of Life Research

IS - Supplement 1

ER -
