A comparative study of explainable ensemble learning and logistic regression for predicting in-hospital mortality in the emergency department

Cited by: 27
Authors
Rahmatinejad, Zahra [1]
Dehghani, Toktam [1, 2]
Hoseini, Benyamin [3]
Rahmatinejad, Fatemeh [1]
Lotfata, Aynaz [4]
Reihani, Hamidreza [5]
Eslami, Saeid [1, 3, 6]
Affiliations
[1] Mashhad Univ Med Sci, Fac Med, Dept Med Informat, Mashhad, Iran
[2] Toos Inst Higher Educ, Mashhad, Iran
[3] Mashhad Univ Med Sci, Pharmaceut Technol Inst, Pharmaceut Res Ctr, Mashhad, Iran
[4] Univ Calif Davis, Sch Vet Med, Dept Pathol Microbiol & Immunol, Davis, CA USA
[5] Mashhad Univ Med Sci, Fac Med, Dept Emergency Med, Mashhad, Iran
[6] Univ Amsterdam, Amsterdam UMC, Locat AMC, Dept Med Microbiol, Amsterdam, Netherlands
Keywords
Machine learning; Prognostic models; Ensemble models; In-hospital mortality; Emergency department; EXTERNAL VALIDATION; APACHE-II; MACHINE; SEVERITY; MODELS; SCORE
DOI
10.1038/s41598-024-54038-4
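Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract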
This study addresses the challenges associated with emergency department (ED) overcrowding and emphasizes the need for efficient risk stratification tools that identify high-risk patients for early intervention. Several scoring systems, often based on logistic regression (LR) models, have been proposed to indicate patient illness severity; this study compares the predictive performance of ensemble learning (EL) models with that of LR for in-hospital mortality in the ED. A cross-sectional single-center study was conducted at the ED of Imam Reza Hospital in northeast Iran from March 2016 to March 2017. The study included adult patients with Emergency Severity Index levels 1 to 3. EL models using Bagging, AdaBoost, random forests (RF), Stacking, and extreme gradient boosting (XGB) algorithms were constructed, along with an LR model. ED visits were randomly divided into training (80%) and validation (20%) sets. After the proposed models were trained using tenfold cross-validation, their predictive performance was evaluated. Model performance was compared using the Brier score (BS), the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUCPR), the Hosmer-Lemeshow (H-L) goodness-of-fit test, precision, sensitivity, accuracy, F1-score, and the Matthews correlation coefficient (MCC). The study included 2025 unique patients admitted to the hospital's ED, with an overall in-hospital mortality of approximately 19%. In the training group and the validation group, 274 of 1476 (18.6%) and 152 of 728 (20.8%) patients died during hospitalization, respectively. In the evaluation of the presented framework, EL models, particularly Bagging, predicted in-hospital mortality with the highest AUROC (0.839, CI 0.802-0.875) and AUCPR = 0.64, comparable in discrimination power to LR (AUROC 0.826, CI 0.787-0.864; AUCPR = 0.61). XGB achieved the highest precision (0.83), sensitivity (0.831), accuracy (0.842), F1-score (0.833), and MCC (0.48). RF was the most accurate model on the unbalanced dataset, with the lowest BS (0.128). Although all studied models overestimated mortality risk and had insufficient calibration (P > 0.05), Stacking demonstrated relatively good agreement between predicted and actual mortality. EL models are not superior to LR in predicting in-hospital mortality in the ED. Both EL and LR models can be considered as screening tools to identify patients at risk of mortality.
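Three of the metrics named in the abstract (BS, AUROC, and MCC) can be illustrated with a minimal pure-Python sketch. The labels and predicted probabilities below are made-up toy values, not data from the study, and the 0.5 decision threshold and the function names are assumptions chosen for illustration; the abstract does not state which software or threshold the authors used.

```python
from math import sqrt


def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and outcome (0 is best)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Chance that a random positive is scored above a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUROC).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def matthews_corrcoef(y_true, y_pred):
    """MCC from the four confusion-matrix counts; 0.0 if undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example (hypothetical values): 1 = died in hospital, 0 = survived.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.4, 0.6, 0.1, 0.3, 0.7, 0.2]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

bs = brier_score(y_true, y_prob)
auc = auroc(y_true, y_prob)
mcc = matthews_corrcoef(y_true, y_pred)
print(f"BS={bs:.3f}  AUROC={auc:.3f}  MCC={mcc:.3f}")
```

BS and AUROC are computed from the predicted probabilities, whereas MCC depends on the thresholded class labels. This may be one reason different models can rank first on different metrics, as the abstract reports.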
Pages: 17