Abstract
BACKGROUND: The assessment of calibration performance of risk prediction models based on regression or more flexible machine learning algorithms receives little attention.
MAIN TEXT: Herein, we argue that this needs to change immediately because poorly calibrated algorithms can be misleading and potentially harmful for clinical decision-making. We summarize how to avoid poor calibration at algorithm development and how to assess calibration at algorithm validation, emphasizing balance between model complexity and the available sample size. At external validation, calibration curves require sufficiently large samples. Algorithm updating should be considered for appropriate support of clinical practice.
CONCLUSION: Efforts are required to avoid poor calibration when developing prediction models, to evaluate calibration when validating models, and to update models when indicated. The ultimate aim is to optimize the utility of predictive analytics for shared decision-making and patient counseling.
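The calibration assessment discussed in the abstract can be illustrated with a minimal sketch. The snippet below estimates the calibration intercept and slope by refitting a logistic regression of the observed binary outcome on the logit of the predicted risks (logistic recalibration); an intercept near 0 and a slope near 1 indicate good calibration. This is an illustrative implementation only, not code from the article: the function name `calibration_slope_intercept`, the Newton-Raphson fitting loop, and the assumption that predicted risks lie strictly between 0 and 1 are all choices made here for the example.

```python
import numpy as np

def calibration_slope_intercept(y, p, n_iter=25):
    """Estimate the calibration intercept and slope by logistic
    recalibration: fit a logistic regression of the observed outcome y
    (0/1) on the logit of the predicted risks p (0 < p < 1).
    Returns (intercept, slope)."""
    lp = np.log(p / (1 - p))                      # logit of predicted risk
    X = np.column_stack([np.ones_like(lp), lp])   # [intercept, slope] design
    beta = np.zeros(2)
    for _ in range(n_iter):                       # Newton-Raphson iterations
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))           # fitted probabilities
        w = mu * (1.0 - mu)                       # IRLS weights
        grad = X.T @ (y - mu)                     # score vector
        hess = X.T @ (X * w[:, None])             # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta[0], beta[1]

# Example: predictions that are well calibrated by construction should
# yield an intercept near 0 and a slope near 1.
rng = np.random.default_rng(1)
p = rng.uniform(0.05, 0.95, size=5000)
y = rng.binomial(1, p)
intercept, slope = calibration_slope_intercept(y, p)
print(intercept, slope)
```

In practice, a slope below 1 suggests overfitting (predictions too extreme), while a nonzero intercept indicates systematic over- or under-estimation of risk; full calibration curves need larger validation samples than these two summary parameters.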
| Original language | English |
|---|---|
| Article number | 230 |
| Journal | BMC Medicine |
| Volume | 17 |
| Early online date | 16 Dec 2019 |
| DOIs | |
| Publication status | Published - 16 Dec 2019 |
Bibliographical note
Acknowledgements

This work was developed as part of the international STRengthening Analytical Thinking for Observational Studies (STRATOS) initiative. The objective of STRATOS is to provide accessible and accurate guidance in the design and analysis of observational studies (http://stratos-initiative.org/). Members of the STRATOS Topic Group ‘Evaluating diagnostic tests and prediction models’ are (alphabetically) Patrick Bossuyt, Gary S. Collins, Petra Macaskill, David J. McLernon, Karel G.M. Moons, Ewout W. Steyerberg, Ben Van Calster, Maarten van Smeden, and Andrew Vickers.
Funding
This work was funded by the Research Foundation – Flanders (FWO; grant G0B4716N) and Internal Funds KU Leuven (grant C24/15/037). The funders had no role in study design, data collection, data analysis, interpretation of results, or writing of the manuscript.
Contributions
All authors conceived of the study. BVC drafted the manuscript. All authors reviewed and edited the manuscript and approved the final version.
Keywords
- Calibration
- Risk prediction models
- Predictive analytics
- Overfitting
- Heterogeneity
- Model performance