Assessing the calibration of mortality benchmarks in critical care: The Hosmer-Lemeshow test revisited

Cited by: 686
Authors
Kramer, Andrew A. [1]
Zimmerman, Jack E. [2]
Affiliations
[1] Cerner Corp, Vienna, VA, USA
[2] Washington Univ, Washington, DC USA
Keywords
intensive care; patient outcome assessment; predictive models; hospital mortality; Hosmer-Lemeshow statistic; logistic regression
DOI
10.1097/01.CCM.0000275267.64078.B0
Chinese Library Classification (CLC) number
R4 [Clinical Medicine]
Discipline classification codes
1002; 100602
Abstract
Objective: To examine the Hosmer-Lemeshow test's sensitivity in evaluating the calibration of models predicting hospital mortality in large critical care populations.
Design: Simulation study.
Setting: Intensive care unit databases used for predictive modeling.
Patients: Simulated data sets represented the approximate number of patients used in earlier versions of critical care predictive models (n = 5,000 and 10,000) and in more recent predictive models (n = 50,000). Each patient had a hospital mortality probability generated as a function of 23 risk variables.
Interventions: None.
Measurements and Main Results: Data sets of 5,000, 10,000, and 50,000 patients were replicated 1,000 times. Logistic regression models were evaluated for each simulated data set. This process was initially carried out under conditions of perfect fit (observed mortality = predicted mortality; standardized mortality ratio = 1.000) and then repeated with an observed mortality that differed slightly (0.4%) from predicted mortality. Under conditions of perfect fit, the Hosmer-Lemeshow test was not influenced by the number of patients in the data set. When there was a slight deviation from perfect fit, the Hosmer-Lemeshow test was sensitive to sample size. For populations of 5,000 patients, 10% of the Hosmer-Lemeshow tests were significant at p < .05, whereas for 10,000 patients 34% of the tests were significant at p < .05. When the number of patients matched contemporary studies (i.e., 50,000 patients), the Hosmer-Lemeshow test was statistically significant in 100% of the models.
Conclusions: Caution should be used in interpreting the calibration of predictive models developed using a smaller data set when they are applied to larger numbers of patients. A significant Hosmer-Lemeshow test does not necessarily mean that a predictive model is suspect or not useful. While decisions concerning a mortality model's suitability should include the Hosmer-Lemeshow test, additional information needs to be taken into consideration. This includes the overall number of patients, the observed and predicted probabilities within each decile, and adjunct measures of model calibration.
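The sample-size effect described in the abstract can be illustrated with a small, noise-free sketch. This is not the authors' simulation: it skips the 23 risk variables, the logistic regression fit, and the 1,000 replications, and instead fixes ten hypothetical risk levels (`RISK_LEVELS`) and makes observed mortality exceed predicted mortality by the same amount in every decile. Reading the abstract's "0.4%" as an absolute excess of 0.004 is an assumption, as are the helper names `chi2_sf_even_df`, `hosmer_lemeshow`, and `build_cohort`. The test uses the common convention of 10 groups and 8 degrees of freedom.

```python
import math

# Hypothetical predicted-mortality levels, one per risk decile (illustrative only).
RISK_LEVELS = [0.01, 0.02, 0.03, 0.05, 0.08, 0.12, 0.18, 0.27, 0.40, 0.60]


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # (x/2)^i / i!
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, p_value) for equal-size groups of sorted predicted risk."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        expected = sum(risk for risk, _ in chunk)   # predicted deaths in the group
        observed = sum(died for _, died in chunk)   # observed deaths in the group
        mean_p = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    # Assumption: groups - 2 degrees of freedom, as used for a development sample.
    return stat, chi2_sf_even_df(stat, groups - 2)


def build_cohort(per_decile, excess=0.004):
    """Noise-free cohort: each decile's death rate exceeds its predicted risk by `excess`."""
    y, p = [], []
    for risk in RISK_LEVELS:
        deaths = round(per_decile * (risk + excess))
        y += [1] * deaths + [0] * (per_decile - deaths)
        p += [risk] * per_decile
    return y, p


# Same relative miscalibration, three cohort sizes.
for per_decile in (500, 1000, 5000):
    y, p = build_cohort(per_decile)
    stat, p_value = hosmer_lemeshow(y, p)
    print(f"n={10 * per_decile:>6}  HL={stat:6.2f}  p={p_value:.3f}")
```

Because the observed and predicted rates in each decile are held fixed, the statistic in this sketch grows in direct proportion to the number of patients: it is about 2 at n = 5,000 and about 20 at n = 50,000, which exceeds the 0.05 critical value of roughly 15.5 for 8 degrees of freedom. Real data add sampling noise on top of this trend, which is why the abstract reports the fraction of significant tests across replications rather than a single statistic.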
Pages: 2052 - 2056
Page count: 5