A simulation study provided sample size guidance for differential item functioning (DIF) studies using short scales

Neil W. Scott, Peter M. Fayers, Neil K. Aaronson, Andrew Bottomley, Alexander de Graeff, Mogens Groenvold, Chad Gundy, Michael Koller, Morten A. Petersen, Mirjam A. G. Sprangers, EORTC Quality of Life Group; Quality of Life Cross-Cultural Meta-Analysis Group

Research output: Contribution to journal (Article)

61 Citations (Scopus)

Abstract

Objective: Differential item functioning (DIF) analyses are increasingly used to evaluate health-related quality of life (HRQoL) instruments, which often include relatively short subscales. Computer simulations were used to explore how various factors including scale length affect analysis of DIF by ordinal logistic regression.

Study Design and Setting: Simulated data, representative of HRQoL scales with four-category items, were generated. The power and type I error rates of the DIF method were then investigated when, respectively, DIF was deliberately introduced and when no DIF was added. The sample size, scale length, floor effects (FEs) and significance level were varied.
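The data-generation step described above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not state the generating model, so the proportional-odds form, the threshold values, the standard normal latent trait, and every name below (`THRESHOLDS`, `simulate_item`, `simulate_scale`, `item_mean`) are assumptions made for illustration. Uniform DIF is represented as a constant logit shift applied to the first item in the focal group only.

```python
import math
import random
from statistics import mean

# Hypothetical category thresholds on the logit scale (an assumption, not from the paper).
THRESHOLDS = (-1.0, 0.0, 1.0)


def simulate_item(theta, thresholds, shift, rng):
    """Draw one response in 0..3 from a proportional-odds model.

    P(Y > k) = logistic(theta + shift - thresholds[k]); thresholds must be ascending.
    """
    u = rng.random()
    response = 0
    for tau in thresholds:
        p_exceed = 1.0 / (1.0 + math.exp(-(theta + shift - tau)))
        if u < p_exceed:
            response += 1
    return response


def simulate_scale(n_per_group, n_items, dif_shift=0.0, seed=None):
    """Return (group, responses) rows for a reference (0) and a focal (1) group.

    Uniform DIF of size dif_shift is added to item 0 in the focal group only.
    """
    rng = random.Random(seed)
    rows = []
    for group in (0, 1):
        for _ in range(n_per_group):
            theta = rng.gauss(0.0, 1.0)  # assumed latent HRQoL trait, same in both groups
            responses = []
            for item in range(n_items):
                shift = dif_shift if (group == 1 and item == 0) else 0.0
                responses.append(simulate_item(theta, THRESHOLDS, shift, rng))
            rows.append((group, responses))
    return rows


def item_mean(rows, group, item):
    """Mean observed score on one item within one group."""
    return mean(resp[item] for g, resp in rows if g == group)


# Example: a two-item scale with 300 respondents per group and an assumed DIF size.
data = simulate_scale(n_per_group=300, n_items=2, dif_shift=0.5, seed=2009)
print(item_mean(data, 0, 0), item_mean(data, 1, 0))
```

Setting `dif_shift=0.0` produces data with no DIF, which is the condition under which type I error would be assessed. A full study along the lines of the abstract would repeat this generation many times, fit an ordinal logistic regression to each replicate, and record the proportion of replicates in which the group term is significant.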

Results: When there was no DIF, type I error rates were close to 5%. Detecting moderate uniform DIF in a two-item scale required a sample size of 300 per group for adequate (>80%) power. For longer scales, a sample size of 200 was adequate. Considerably larger sample sizes were required to detect nonuniform DIF, when there were extreme FEs or when a reduced type I error rate was required.

Conclusion: The impact of the number of items in the scale was relatively small. Ordinal logistic regression successfully detects DIF for HRQoL instruments with short scales. Sample size guidelines are provided. © 2008 Elsevier Inc. All rights reserved.

Original language: English
Pages (from-to): 288-295
Number of pages: 8
Journal: Journal of Clinical Epidemiology
Volume: 62
Issue number: 3
DOI: https://doi.org/10.1016/j.jclinepi.2008.06.003
Publication status: Published - Mar 2009

Keywords

  • Health-related quality of life
  • Differential item functioning
  • Ordinal logistic regression
  • Short scales
  • Computer simulations
  • Floor effects
  • mantel-haenszel procedures
  • logistic-regression
  • health applications
  • questionnaire
  • equivalence
  • invariance
  • QLQ-C30
  • bias

Cite this

Scott, N. W., Fayers, P. M., Aaronson, N. K., Bottomley, A., de Graeff, A., Groenvold, M., ... EORTC Quality of Life Group; Quality of Life Cross-Cultural Meta-Analysis Group (2009). A simulation study provided sample size guidance for differential item functioning (DIF) studies using short scales. Journal of Clinical Epidemiology, 62(3), 288-295. https://doi.org/10.1016/j.jclinepi.2008.06.003
