External validation of clinical prediction models: simulation-based sample size calculations were more reliable than rules-of-thumb

Cited by: 69
Authors
Snell, Kym I. E. [1]
Archer, Lucinda [1]
Ensor, Joie [1]
Bonnett, Laura J. [2]
Debray, Thomas P. A. [3]
Phillips, Bob [4]
Collins, Gary S. [5, 6]
Riley, Richard D. [1]
Affiliations
[1] Keele Univ, Ctr Prognosis Res, Sch Med, Keele, Staffs, England
[2] Univ Liverpool, Dept Biostat, Liverpool, Merseyside, England
[3] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Utrecht, Netherlands
[4] Univ York, Ctr Reviews & Disseminat, York, N Yorkshire, England
[5] Univ Oxford, Ctr Stat Med, Nuffield Dept Orthopaed Rheumatol & Musculoskelet, Oxford, England
[6] John Radcliffe Hosp, NIHR Oxford Biomed Res Ctr, Oxford, England
Funding
UK Medical Research Council
Keywords
Sample size; External validation; Clinical prediction model; Calibration and discrimination; Net benefit; Simulation
DOI
10.1016/j.jclinepi.2021.02.011
Chinese Library Classification number
R19 [Health care organization and services (health services administration)]
Abstract
Introduction: Sample size "rules-of-thumb" for external validation of clinical prediction models suggest at least 100 events and 100 non-events. Such blanket guidance is imprecise, and not specific to the model or validation setting. We investigate factors affecting precision of model performance estimates upon external validation, and propose a more tailored sample size approach. Methods: Simulation of logistic regression prediction models to investigate factors associated with precision of performance estimates. Then, explanation and illustration of a simulation-based approach to calculate the minimum sample size required to precisely estimate a model's calibration, discrimination and clinical utility. Results: Precision is affected by the model's linear predictor (LP) distribution, in addition to number of events and total sample size. Sample sizes of 100 (or even 200) events and non-events can give imprecise estimates, especially for calibration. The simulation-based calculation accounts for the LP distribution and (mis)calibration in the validation sample. Application identifies 2430 required participants (531 events) for external validation of a deep vein thrombosis diagnostic model. Conclusion: Where researchers can anticipate the distribution of the model's LP (e.g., based on development sample, or a pilot study), a simulation-based approach for calculating sample size for external validation offers more flexibility and reliability than rules-of-thumb. © 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
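The simulation-based idea in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes the LP is normally distributed with a hypothetical mean and SD, assumes the model is perfectly calibrated in the validation population, and uses the mean 95% confidence interval width of the calibration slope as the only precision target. All function names, parameter values and the target width are illustrative assumptions.

```python
import math
import random
import statistics


def _info(lp, y, a, b):
    """Score vector and information matrix entries for logit(p) = a + b * lp."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, yi in zip(lp, y):
        p = 1.0 / (1.0 + math.exp(-(a + b * x)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return g0, g1, h00, h01, h11


def fit_calibration(lp, y, max_iter=25, tol=1e-8):
    """Fit the calibration model by Newton-Raphson.

    Returns (intercept, slope, standard error of the slope).
    """
    a, b = 0.0, 1.0
    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = _info(lp, y, a, b)
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    # Slope variance is the matching diagonal entry of the inverse information.
    _, _, h00, h01, h11 = _info(lp, y, a, b)
    se_b = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return a, b, se_b


def mean_slope_ci_width(n, lp_mean, lp_sd, reps, rng):
    """Average 95% CI width of the calibration slope over simulated datasets."""
    widths = []
    for _ in range(reps):
        # Assumption: LP ~ Normal(lp_mean, lp_sd) in the validation population.
        lp = [rng.gauss(lp_mean, lp_sd) for _ in range(n)]
        # Assumption: the model is well calibrated, so P(event) = expit(LP).
        y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-x)) else 0 for x in lp]
        _, _, se_b = fit_calibration(lp, y)
        widths.append(2 * 1.96 * se_b)
    return statistics.mean(widths)


def minimum_sample_size(candidates, target_width, lp_mean, lp_sd,
                        reps=200, seed=2021):
    """Smallest candidate n whose mean slope CI width meets the target, else None."""
    rng = random.Random(seed)
    for n in sorted(candidates):
        if mean_slope_ci_width(n, lp_mean, lp_sd, reps, rng) <= target_width:
            return n
    return None
```

For example, `minimum_sample_size(range(500, 5001, 500), 0.2, -1.5, 1.0)` would search a coarse grid under these assumed inputs. A fuller calculation would repeat the search for each performance measure of interest (calibration, discrimination and net benefit) and take the largest resulting sample size.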
Pages: 79-89
Number of pages: 11