Bootstrap methods for developing predictive models

Cited by: 486
Authors
Austin, PC
Tu, JV
Affiliations
[1] Inst Clin Evaluat Sci, Toronto, ON M4N 3M5, Canada
[2] Univ Toronto, Dept Publ Hlth Sci, Toronto, ON, Canada
[3] Univ Toronto, Dept Hlth Policy Management & Evaluat, Toronto, ON, Canada
[4] Inst Clin Evaluat Sci, Toronto, ON, Canada
[5] Sunnybrook & Womens Coll, Hlth Sci Ctr, Div Gen Internal Med, Toronto, ON, Canada
Funding
Canadian Institutes of Health Research;
Keywords
acute myocardial infarction; epidemiological research; mortality; multivariate analysis; regression models; variable selection;
DOI
10.1198/0003130043277
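Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code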
020208 ; 070103 ; 0714 ;
Abstract
Researchers frequently use automated model selection methods such as backwards elimination to identify variables that are independent predictors of an outcome under consideration. We propose using bootstrap resampling in conjunction with automated variable selection methods to develop parsimonious prediction models. Using data on patients admitted to hospital with a heart attack, we demonstrate that selecting those variables that were identified as independent predictors of mortality in at least 60% of the bootstrap samples resulted in a parsimonious model with excellent predictive ability.
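The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`bootstrap_inclusion_frequency`, `stable_variables`, `toy_select`), the toy data, and the correlation-based selector are all assumptions made for illustration. In particular, `toy_select` is a simple stand-in for an automated procedure such as backwards elimination, so that the example stays self-contained; any selector that returns a list of variable names could be passed in its place.

```python
import random
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def toy_select(sample, outcome="died", cutoff=0.3):
    """Hypothetical stand-in selector (NOT backwards elimination):
    keep predictors whose |correlation| with the outcome exceeds cutoff."""
    ys = [row[outcome] for row in sample]
    names = [k for k in sample[0] if k != outcome]
    return [k for k in names
            if abs(pearson([row[k] for row in sample], ys)) > cutoff]


def bootstrap_inclusion_frequency(rows, select, n_boot=100, seed=0):
    """Fraction of bootstrap samples in which `select` picks each variable."""
    rng = random.Random(seed)
    n = len(rows)
    counts = Counter()
    for _ in range(n_boot):
        # Draw n rows with replacement from the original data.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        counts.update(set(select(sample)))
    return {name: c / n_boot for name, c in counts.items()}


def stable_variables(freq, threshold=0.6):
    """Variables selected in at least `threshold` of the bootstrap samples."""
    return sorted(name for name, f in freq.items() if f >= threshold)


# Toy data (an assumption, not the paper's data): "age" drives the outcome,
# while "noise" is balanced against it and carries no information.
rows = [{"age": 40 + i % 50, "noise": (i // 50) % 2,
         "died": int(40 + i % 50 >= 65)} for i in range(200)]
freq = bootstrap_inclusion_frequency(rows, toy_select, n_boot=100, seed=1)
print(stable_variables(freq))
```

The 0.6 default mirrors the 60% cutoff mentioned in the abstract; the number of bootstrap samples and the seed are arbitrary choices for the sketch.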
Pages: 131-137
Page count: 7