Calculating the sample size required for developing a clinical prediction model

Cited by: 1220
Authors
Riley, Richard D. [1]
Ensor, Joie [1]
Snell, Kym I. E. [1]
Harrell, Frank E., Jr. [2]
Martin, Glen P. [3]
Reitsma, Johannes B. [4]
Moons, Karel G. M. [4]
Collins, Gary [5]
van Smeden, Maarten [5, 6]
Affiliations
[1] Keele Univ, Sch Primary Community & Social Care, Ctr Prognosis Res, Keele ST5 5BG, Staffs, England
[2] Vanderbilt Univ, Sch Med, Dept Biostat, Nashville, TN 37212 USA
[3] Univ Manchester, Fac Biol Med & Hlth, Div Informat Imaging & Data Sci, Manchester, Lancs, England
[4] Univ Med Ctr Utrecht, Julius Ctr Hlth Sci, Utrecht, Netherlands
[5] Univ Oxford, Nuffield Dept Orthopaed Rheumatol & Musculoskelet, Ctr Stat Med, Oxford, England
[6] Leiden Univ, Med Ctr, Dept Clin Epidemiol, Leiden, Netherlands
Source
BMJ-BRITISH MEDICAL JOURNAL | 2020 / Volume 368
Keywords
EXTERNAL VALIDATION; STATISTICAL POWER; REGRESSION-MODELS; PROGNOSTIC MODEL; EVENTS; RISK; NUMBER; SELECTION; ACCURACY; APPLICABILITY;
DOI
10.1136/bmj.m441
Chinese Library Classification
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Clinical prediction models aim to predict outcomes in individuals, to inform diagnosis or prognosis in healthcare. Hundreds of prediction models are published in the medical literature each year, yet many are developed using a dataset that is too small in terms of the total number of participants or outcome events. This leads to inaccurate predictions and consequently incorrect healthcare decisions for some individuals. In this article, the authors provide guidance on how to calculate the sample size required to develop a clinical prediction model.
Pages: 12
Related papers
81 in total
  • [1] CARDIOVASCULAR-DISEASE RISK PROFILES
    ANDERSON, KM
    ODELL, PM
    WILSON, PWF
    KANNEL, WB
    [J]. AMERICAN HEART JOURNAL, 1991, 121 (01) : 293 - 298
  • [2] [Anonymous], 2017, BMJ
  • [3] Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models
    Austin, Peter C.
    Steyerberg, Ewout W.
    [J]. STATISTICAL METHODS IN MEDICAL RESEARCH, 2017, 26 (02) : 796 - 808
  • [4] The number of subjects per variable required in linear regression analyses
    Austin, Peter C.
    Steyerberg, Ewout W.
    [J]. JOURNAL OF CLINICAL EPIDEMIOLOGY, 2015, 68 (06) : 627 - 636
  • [5] Sample size considerations for the external validation of a multivariable prognostic model: a resampling study
    Collins, Gary S.
    Ogundimu, Emmanuel O.
    Altman, Douglas G.
    [J]. STATISTICS IN MEDICINE, 2016, 35 (02) : 214 - 226
  • [6] Collins GS, 2015, J CLIN EPIDEMIOL, V68, P112, DOI [10.1038/bjc.2014.639, 10.7326/M14-0697, 10.1016/j.eururo.2014.11.025, 10.1016/j.jclinepi.2014.11.010, 10.1136/bmj.g7594, 10.1002/bjs.9736, 10.1186/s12916-014-0241-z, 10.7326/M14-0698]
  • [7] Importance of events per independent variable in proportional hazards analysis .1. Background, goals, and general strategy
    Concato, J
    Peduzzi, P
    Holford, TR
    Feinstein, AR
    [J]. JOURNAL OF CLINICAL EPIDEMIOLOGY, 1995, 48 (12) : 1495 - 1501
  • [8] Copas J B, 1997, Stat Methods Med Res, V6, P167, DOI 10.1191/096228097667367976
  • [9] COPAS JB, 1983, J R STAT SOC B, V45, P311
  • [10] Performance of logistic regression modeling: beyond the number of events per variable, the role of data structure
    Courvoisier, Delphine S.
    Combescure, Christophe
    Agoritsas, Thomas
    Gayet-Ageron, Angele
    Perneger, Thomas V.
    [J]. JOURNAL OF CLINICAL EPIDEMIOLOGY, 2011, 64 (09) : 993 - 1000