Comparison of machine learning and the regression-based EHMRG model for predicting early mortality in acute heart failure

Cited by: 17
Authors
Austin, David E. [1]
Lee, Douglas S. [1, 2, 3, 4, 9]
Wang, Chloe X. [1, 7]
Ma, Shihao [1, 5, 6]
Wang, Xuesong [1]
Porter, Joan [1]
Wang, Bo [1, 2, 3, 5, 6, 7, 8]
Affiliations
[1] Inst Clin Evaluat Sci, ICES, 2075 Bayview Ave, Toronto, ON M4N 3M5, Canada
[2] Univ Hlth Network, Peter Munk Cardiac Ctr, 585 Univ Ave, Toronto, ON M5G 2N2, Canada
[3] Univ Hlth Network, Joint Dept Med Imaging, 585 Univ Ave, Toronto, ON M5G 2N2, Canada
[4] Ted Rogers Ctr Heart Res, 661 Univ Ave, Toronto, ON M5G1X8, Canada
[5] Univ Toronto, Dept Comp Sci, 40 St George St, Toronto, ON M5S2E4, Canada
[6] Vector Inst Artificial Intelligence, 661 Univ Ave,Suite 710, Toronto, ON M5G1M1, Canada
[7] Univ Hlth Network, Div Vasc Surg, 190 Elizabeth St, Toronto, ON M5G2C4, Canada
[8] Univ Toronto, Dept Lab Med & Pathobiol, Toronto, ON, Canada
[9] Univ Hlth Network, Peter Munk Cardiac Ctr, Div Cardiol, ICES,Ted Rogers Chair Heart Funct Outcomes,Med, Toronto, ON, Canada
Funding
Canadian Institutes of Health Research
Keywords
Heart failure; Machine learning; Statistical models; Prognosis; Outcomes; Mortality; CARE; CLASSIFICATION; SIMULATION; EVENTS;
DOI
10.1016/j.ijcard.2022.07.035
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002 ; 100201 ;
Abstract
Background: Although risk stratification of patients with acute decompensated heart failure (HF) is important, it is unknown whether machine learning (ML) or conventional statistical models are optimal. We developed ML algorithms to predict 7-day and 30-day mortality in patients with acute HF and compared these with an existing logistic regression model at the same timepoints.

Methods: Patients presenting to one of 86 hospitals, who were either admitted to hospital or discharged home directly from the emergency department, were randomly selected using stratified random sampling. ML approaches, including neural networks, random forest, XGBoost, and the lasso, were compared with a validated logistic regression model for discrimination and calibration.

Results: Among 12,608 patients in our analysis, lasso regression (c-statistic 0.774; 95% CI, 0.743, 0.806) performed better than other ML models for 7-day mortality but did not outperform the baseline logistic regression model (0.794; 95% CI, 0.789, 0.800). For 30-day mortality, XGBoost performed better than other ML models (c-statistic 0.759; 95% CI, 0.740, 0.779), but was not significantly better than logistic regression (c-statistic 0.755; 95% CI, 0.750, 0.762). Logistic regression demonstrated better calibration at 7 days (calibration-in-the-large 0.017; 95% CI, -0.657, 0.692, and calibration slope 0.954; 95% CI, 0.769, 1.139) and at 30 days (-0.026; 95% CI, -0.374, 0.322, and 0.964; 95% CI, 0.831, 1.098), and the best Brier scores, compared to ML approaches.

Conclusions: Logistic regression was comparable to ML in discrimination, but was superior to ML algorithms in calibration overall. ML algorithms for prognosis should routinely report calibration metrics in addition to discrimination.
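For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how a c-statistic, Brier score, calibration-in-the-large, and calibration slope are commonly computed from predicted risks and observed outcomes. It is not the authors' code. The function names (`c_statistic`, `brier_score`, `calibration_metrics`) are hypothetical, and the definitions assume the usual conventions: calibration-in-the-large is the intercept of a logistic model with the logit of the predicted risk as a fixed offset, and the calibration slope is the coefficient on that logit.

```python
import numpy as np

def c_statistic(y, p):
    """Share of (event, non-event) pairs where the event has the higher
    predicted risk; ties count one half. Builds a pairwise matrix, so it
    is only suitable for small samples."""
    y = np.asarray(y)
    p = np.asarray(p, dtype=float)
    diff = p[y == 1][:, None] - p[y == 0][None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def brier_score(y, p):
    """Mean squared difference between predicted risk and outcome (0/1)."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((p - y) ** 2))

def _fit_logistic(X, y, offset, n_iter=50):
    """Newton-Raphson fit of logit(P(y=1)) = X @ beta + offset."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta + offset)))
        grad = X.T @ (y - mu)
        hess = X.T @ (X * (mu * (1.0 - mu))[:, None])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def calibration_metrics(y, p, eps=1e-12):
    """Return (calibration-in-the-large, calibration slope).
    Ideal values are 0 and 1 respectively."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    lp = np.log(p / (1.0 - p))          # logit of the predicted risk
    ones = np.ones(len(y))
    # Intercept-only model with the logit as offset (slope fixed at 1).
    citl = _fit_logistic(ones[:, None], y, lp)[0]
    # Intercept plus slope on the logit, no offset.
    slope = _fit_logistic(np.column_stack([ones, lp]), y, np.zeros(len(y)))[1]
    return float(citl), float(slope)

if __name__ == "__main__":
    # Toy data for illustration only; not from the study.
    y_toy = [0, 0, 1, 1]
    p_toy = [0.1, 0.4, 0.35, 0.8]
    print("c-statistic:", c_statistic(y_toy, p_toy))
    print("Brier score:", brier_score(y_toy, p_toy))
```

A model can rank patients well (high c-statistic) while systematically over- or under-estimating risk, which is why the abstract treats calibration as a separate criterion from discrimination.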
Pages: 78-84
Number of pages: 7
Related papers
37 in total
[1]   AN IMPROVED ALGORITHM FOR NEURAL-NETWORK CLASSIFICATION OF IMBALANCED TRAINING SETS [J].
ANAND, R ;
MEHROTRA, KG ;
MOHAN, CK ;
RANKA, S .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1993, 4 (06) :962-969
[2]   Health Care Use Before First Heart Failure Hospitalization: Identifying Opportunities to Pre-Emptively Diagnose Impending Decompensation [J].
Anderson, Kim ;
Ross, Heather J. ;
Austin, Peter C. ;
Fang, Jiming ;
Lee, Douglas S. .
JACC-HEART FAILURE, 2020, 8 (12) :1024-1034
[3]   Machine Learning Prediction of Mortality and Hospitalization in Heart Failure With Preserved Ejection Fraction [J].
Angraal, Suveen ;
Mortazavi, Bobak J. ;
Gupta, Aakriti ;
Khera, Rohan ;
Ahmad, Tariq ;
Desai, Nihar R. ;
Jacoby, Daniel L. ;
Masoudi, Frederick A. ;
Spertus, John A. ;
Krumholz, Harlan M. .
JACC-HEART FAILURE, 2020, 8 (01) :12-21
[4]  
[Anonymous], 2009, The elements of statistical learning
[5]   Acute heart failure [J].
Arrigo, Mattia ;
Jessup, Mariell ;
Mullens, Wilfried ;
Reza, Nosheen ;
Shah, Ajay M. ;
Sliwa, Karen ;
Mebazaa, Alexandre .
NATURE REVIEWS DISEASE PRIMERS, 2020, 6 (01)
[6]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[7]   XGBoost: A Scalable Tree Boosting System [J].
Chen, Tianqi ;
Guestrin, Carlos .
KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, :785-794
[8]   Clinical Research Machine Learning Compared With Conventional Statistical Models for Predicting Myocardial Infarction Readmission and Mortality: A Systematic Review [J].
Cho, Sung Min ;
Austin, Peter C. ;
Ross, Heather J. ;
Abdel-Qadir, Husam ;
Chicco, Davide ;
Tomlinson, George ;
Taheri, Cameron ;
Foroutan, Farid ;
Lawler, Patrick R. ;
Billia, Filio ;
Gramolini, Anthony ;
Epelman, Slava ;
Wang, Bo ;
Lee, Douglas S. .
CANADIAN JOURNAL OF CARDIOLOGY, 2021, 37 (08) :1207-1214
[9]   A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models [J].
Christodoulou, Evangelia ;
Ma, Jie ;
Collins, Gary S. ;
Steyerberg, Ewout W. ;
Verbakel, Jan Y. ;
Van Calster, Ben .
JOURNAL OF CLINICAL EPIDEMIOLOGY, 2019, 110 :12-22
[10]   Assessing Risk and Preventing 30-Day Readmissions in Decompensated Heart Failure: Opportunity to Intervene? [J].
Dunbar-Yaffe R. ;
Stitt A. ;
Lee J.J. ;
Mohamed S. ;
Lee D.S. .
Current Heart Failure Reports, 2015, 12 (5) :309-317