Predicting missing quality of life data that were later recovered: an empirical comparison of approaches

Research output: Contribution to journal › Article


Abstract

Background and Purpose: The aim was to compare simple imputation, multiple imputation, and modeling approaches for dealing with ‘missing’ quality of life data. Data were obtained from five clinical trials that employed a reminder system for follow-up questionnaires. Previous studies have compared imputation strategies by artificially removing data according to prespecified mechanisms. Our approach differs from previous studies in that actual collected data are used.

Methods: Data obtained by reminder were initially treated as missing. These missing values were imputed using a variety of simple and multiple imputation strategies. The trials were analyzed using the imputed datasets, and the resulting treatment effects were compared with those from analyses of the full dataset, including responses received after reminders. A repeated measures model was also fitted to the available data, and pattern mixture models were employed. The accuracy of each strategy was assessed by the bias in the estimated treatment difference relative to the treatment difference actually observed.
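The evaluation procedure described in the Methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis: the trial data are simulated, the field names (`arm`, `baseline`, `followup`, `by_reminder`), the effect size, and the reminder probabilities are all hypothetical, and baseline carried forward stands in for the wider set of strategies the study compared.

```python
import random
from statistics import mean


def treatment_difference(rows):
    """Mean follow-up score in the treatment arm minus the control arm."""
    treated = [r["followup"] for r in rows if r["arm"] == "treatment"]
    control = [r["followup"] for r in rows if r["arm"] == "control"]
    return mean(treated) - mean(control)


def baseline_carried_forward(rows):
    """Treat reminder responses as missing and impute them with the baseline score."""
    return [
        {**r, "followup": r["baseline"]} if r["by_reminder"] else dict(r)
        for r in rows
    ]


def simulate_trial(n_per_arm=200, seed=0):
    """Hypothetical two-arm trial; all numbers are invented for illustration."""
    rng = random.Random(seed)
    rows = []
    for arm, effect in (("control", 0.0), ("treatment", 5.0)):
        for _ in range(n_per_arm):
            baseline = rng.gauss(50, 10)
            followup = baseline + effect + rng.gauss(0, 5)
            # Assumption: patients with poorer scores more often need a reminder.
            by_reminder = rng.random() < (0.4 if followup < 50 else 0.2)
            rows.append({"arm": arm, "baseline": baseline,
                         "followup": followup, "by_reminder": by_reminder})
    return rows


rows = simulate_trial()
# Reference analysis: every response, including those obtained by reminder.
full_diff = treatment_difference(rows)
# Comparison analysis: reminder responses replaced by imputed values.
imputed_diff = treatment_difference(baseline_carried_forward(rows))
# Bias of the strategy: imputed estimate minus the actually observed estimate.
bias = imputed_diff - full_diff
print(f"full={full_diff:.2f} imputed={imputed_diff:.2f} bias={bias:.2f}")
```

Swapping `baseline_carried_forward` for another imputation function leaves the rest of the comparison unchanged, which is what allows several strategies to be ranked against the same recovered data.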

Results: Baseline carried forward and last value carried forward were shown to be the best simple imputation methods in this setting. Multiple imputation using a regression model or a predictive mean matching model tended to provide treatment difference estimates with the least bias when compared to the actual observed data. Pattern mixture models did not perform well. Overall, the multiple imputation procedures were generally the least biased approaches.
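As a rough illustration of regression-based multiple imputation, the sketch below imputes each missing follow-up score from a per-arm regression on baseline, repeats this several times, and pools the treatment differences with Rubin's rules. Everything specific here is an assumption made for the example: the abstract does not state the imputation model, its covariates, or the number of imputations, and the data are simulated. The sketch is also simplified in that it adds residual noise to the predictions but does not redraw the regression coefficients, as a fully proper multiple imputation would.

```python
import random
from statistics import mean, variance


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual SD)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, (rss / (len(xs) - 2)) ** 0.5


def arm_difference(rows, scores):
    """Treatment-minus-control mean difference and its sampling variance."""
    t = [s for r, s in zip(rows, scores) if r["arm"] == "treatment"]
    c = [s for r, s in zip(rows, scores) if r["arm"] == "control"]
    return mean(t) - mean(c), variance(t) / len(t) + variance(c) / len(c)


def multiply_impute(rows, m=20, seed=0):
    """Impute missing follow-up scores m times and pool with Rubin's rules."""
    rng = random.Random(seed)
    models = {}
    for arm in ("treatment", "control"):
        obs = [r for r in rows if r["arm"] == arm and r["followup"] is not None]
        models[arm] = fit_line([r["baseline"] for r in obs],
                               [r["followup"] for r in obs])
    estimates, variances = [], []
    for _ in range(m):
        scores = []
        for r in rows:
            if r["followup"] is None:
                a, b, sd = models[r["arm"]]
                # Prediction plus random residual noise (simplification:
                # the regression coefficients themselves are not redrawn).
                scores.append(a + b * r["baseline"] + rng.gauss(0, sd))
            else:
                scores.append(r["followup"])
        est, var = arm_difference(rows, scores)
        estimates.append(est)
        variances.append(var)
    # Rubin's rules: total variance = within + (1 + 1/m) * between.
    within, between = mean(variances), variance(estimates)
    return {"estimate": mean(estimates), "within": within, "between": between,
            "total": within + (1 + 1 / m) * between}


# Hypothetical data: follow-up is None where the response needed a reminder.
rng = random.Random(1)
rows = []
for arm, effect in (("control", 0.0), ("treatment", 5.0)):
    for _ in range(100):
        baseline = rng.gauss(50, 10)
        followup = baseline + effect + rng.gauss(0, 5)
        rows.append({"arm": arm, "baseline": baseline,
                     "followup": None if rng.random() < 0.3 else followup})

result = multiply_impute(rows)
print(result)
```

The between-imputation variance is what separates this from a single imputation: it carries the uncertainty about the imputed values into the pooled standard error instead of treating them as observed.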

Limitations: A number of imputation and modeling procedures have been investigated but this list is not exhaustive. All the example datasets come from the same data source and perhaps studies from additional disease areas would have been useful. However, we feel the results are generalizable to other quality of life outcomes and clinical areas.

Conclusions: Multiple imputation is recommended for missing quality of life data because it assumes the data are missing at random, which in the quality of life setting is more plausible than the assumption of missing completely at random on which most simple imputation methods are based. Pattern mixture models can be complex and did not perform well in this setting. Clinical Trials 2010; 7: 333-342. http://ctj.sagepub.com
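For reference, the two missingness assumptions contrasted in the Conclusions are conventionally written as below. The notation is the standard textbook one and is an assumption here, since the abstract itself gives no formulas: R is the missingness indicator, and Y_obs and Y_mis are the observed and unobserved parts of the data.

```latex
% R: missingness indicator; Y_obs, Y_mis: observed and unobserved data.
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R \mid Y_{\mathrm{obs}})
\end{align*}
```

Under MAR, whether a questionnaire is returned may depend on what has already been observed (for example, an earlier score), but not on the unobserved value itself.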
Original language: English
Pages (from-to): 333-342
Number of pages: 10
Journal: Clinical Trials
Volume: 7
Issue number: 4
Early online date: 24 Jun 2010
DOI: 10.1177/1740774510374626
Publication status: Published - Aug 2010

Fingerprint

Quality of Life
Reminder Systems
Clinical Trials
Information Storage and Retrieval
Datasets

Cite this

@article{bd3cc148f42e4bdf874e0aa8f191a632,
title = "Predicting missing quality of life data that were later recovered: an empirical comparison of approaches",
author = "Shona Fielding and Peter Fayers and Craig Ramsay",
note = "Acknowledgments: We would like to thank the Centre for Health Care Randomized Trials, based within the Health Services Research Unit, and its staff for providing the data used for this study. In particular, we thank Gladys McPherson, Alison McDonald, Graeme MacLennan, Jonathan Cook and Samantha Wileman, who assisted with data queries and provided background to the trials. The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health Directorate. Shona Fielding was funded by the Chief Scientist Office on a Research Training Fellowship (CZF/1/31) while carrying out this study. The views expressed are, however, not necessarily those of the funding body.",
year = "2010",
month = "8",
doi = "10.1177/1740774510374626",
language = "English",
volume = "7",
pages = "333--342",
journal = "Clinical Trials",
issn = "1740-7745",
publisher = "Sage Publications",
number = "4",

}
