Penalization and shrinkage methods produced unreliable clinical prediction models especially when sample size was small

Cited by: 76
Authors
Riley, Richard D. [1]
Snell, Kym I. E. [1]
Martin, Glen P. [2]
Whittle, Rebecca [1]
Archer, Lucinda [1]
Sperrin, Matthew [2]
Collins, Gary S. [3, 4]
Affiliations
[1] Keele Univ, Sch Med, Ctr Prognosis Res, Keele ST5 5BG, Staffs, England
[2] Univ Manchester, Manchester Acad Hlth Sci Ctr, Fac Biol Med & Hlth, Div Informat Imaging & Data Sci, Manchester, Lancs, England
[3] Univ Oxford, Nuffield Dept Orthopaed, Ctr Stat Med Rheumatol & Musculoskeletal Sci, Oxford OX3 7LD, England
[4] John Radcliffe Hosp, NIHR Oxford Biomed Res Ctr, Oxford OX3 9DU, England
Keywords
Risk prediction models; Penalization; Shrinkage; Overfitting; Sample size; LOGISTIC-REGRESSION ANALYSIS; CALIBRATION; VALIDATION;
DOI
10.1016/j.jclinepi.2020.12.005
Chinese Library Classification number
R19 [Health care organization and services (health services administration)];
Abstract
Objectives: When developing a clinical prediction model, penalization techniques are recommended to address overfitting, as they shrink predictor effect estimates toward the null and reduce mean-square prediction error in new individuals. However, shrinkage and penalty terms ("tuning parameters") are estimated with uncertainty from the development data set. We examined the magnitude of this uncertainty and the subsequent impact on prediction model performance.
Study Design and Setting: This study comprises applied examples and a simulation study of the following methods: uniform shrinkage (estimated via a closed-form solution or bootstrapping), ridge regression, the lasso, and elastic net.
Results: In a particular model development data set, penalization methods can be unreliable because tuning parameters are estimated with large uncertainty. This is of most concern when development data sets have a small effective sample size and the model's Cox-Snell R² is low. The problem can lead to considerable miscalibration of model predictions in new individuals.
Conclusion: Penalization methods are not a "carte blanche"; they do not guarantee a reliable prediction model is developed. They are more unreliable when needed most (i.e., when overfitting may be large). We recommend they are best applied with large effective sample sizes, as identified from recent sample size calculations that aim to minimize the potential for model overfitting and precisely estimate key parameters.
© 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
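As a rough illustration of why small samples and a low Cox-Snell R² are a concern, the sketch below computes a uniform shrinkage factor from the heuristic commonly attributed to van Houwelingen and le Cessie, S = 1 - p / LR, with the likelihood ratio statistic written as LR = -n ln(1 - R²). This is an assumption: the abstract does not say which closed-form solution the authors used, and the function name and arguments here are hypothetical.

```python
import math


def heuristic_shrinkage(n, p, r2_cs):
    """Heuristic uniform shrinkage factor S = 1 - p / LR (illustrative sketch).

    n     : number of participants in the development data set
    p     : number of predictor parameters in the model
    r2_cs : apparent Cox-Snell R-squared of the fitted model, in (0, 1)

    Assumes LR = -n * ln(1 - R2_CS), which follows from the definition of
    the Cox-Snell R-squared. Not necessarily the authors' exact method.
    """
    if n <= 0 or p <= 0:
        raise ValueError("n and p must be positive")
    if not 0.0 < r2_cs < 1.0:
        raise ValueError("r2_cs must lie strictly between 0 and 1")
    # Likelihood ratio chi-square statistic implied by n and the Cox-Snell R2.
    lr = -n * math.log(1.0 - r2_cs)
    # Values near 1 mean little shrinkage is needed; values near 0 signal
    # severe overfitting.
    return 1.0 - p / lr


small = heuristic_shrinkage(n=100, p=10, r2_cs=0.1)
large = heuristic_shrinkage(n=1000, p=10, r2_cs=0.1)
print(f"n=100:  S = {small:.2f}")
print(f"n=1000: S = {large:.2f}")
```

Under these assumed inputs (10 parameters, R² of 0.1), the factor is about 0.05 with 100 participants and about 0.91 with 1000. The formula returns only a point estimate; the abstract's concern is that this estimate itself is uncertain in a given development data set, which the sketch does not quantify.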
Pages: 88-96
Number of pages: 9