Comparison of Machine Learning Algorithms for Predicting Hospital Readmissions and Worsening Heart Failure Events in Patients With Heart Failure With Reduced Ejection Fraction: Modeling Study

Cited by: 16

Authors:

Ru, Boshu [1]

Tan, Xi [1]

Liu, Yu [1]

Kannapur, Kartik [2]

Ramanan, Dheepan [2]

Kessler, Garin [2,3]

Lautsch, Dominik [1]

Fonarow, Gregg [4,5]

Affiliations:

[1] Merck & Co Inc, Rahway, NJ USA

[2] Amazon Web Serv Inc, Seattle, WA USA

[3] Georgetown Univ, Sch Continuing Studies, Washington, DC USA

[4] Univ Calif Los Angeles, Ahmanson UCLA Cardiomyopathy Ctr, Los Angeles, CA USA

[5] Univ Calif Los Angeles, Ahmanson UCLA Cardiomyopathy Ctr, 10833 LeConte Ave, Los Angeles, CA 90095 USA

Source:

JMIR FORMATIVE RESEARCH | 2023 / Vol. 7

Keywords:

deep learning; machine learning; hospital readmission; heart failure; heart failure with reduced ejection fraction; worsening heart failure event; Bidirectional Encoder Representations From Transformers; BERT; clinical registry; medical claims; real-world data;

DOI:

10.2196/41775

Chinese Library Classification (CLC) number:

R19 [Health care organization and services (health services administration)];

Abstract:

Background: Heart failure (HF) is highly prevalent in the United States. Approximately one-third to one-half of HF cases are categorized as HF with reduced ejection fraction (HFrEF). Patients with HFrEF are at risk of worsening HF, have a high risk of adverse outcomes, and experience higher health care use and costs. Therefore, it is crucial to identify patients with HFrEF who are at high risk of subsequent events after HF hospitalization.

Objective: Machine learning (ML) has been used to predict HF-related outcomes. The objective of this study was to compare different ML prediction models and feature construction methods to predict 30-, 90-, and 365-day hospital readmissions and worsening HF events (WHFEs).

Methods: We used the Veradigm PINNACLE outpatient registry linked to Symphony Health's Integrated Dataverse data from July 1, 2013, to September 30, 2017. Adults with a confirmed diagnosis of HFrEF and HF-related hospitalization were included. WHFEs were defined as HF-related hospitalizations or outpatient intravenous diuretic use within 1 year of the first HF hospitalization. We used different approaches to construct ML features from clinical codes, including frequencies of clinical classification software (CCS) categories, Bidirectional Encoder Representations From Transformers (BERT) trained with CCS sequences (BERT + CCS), BERT trained on raw clinical codes (BERT + raw), and prespecified features based on clinical knowledge. A multilayer perceptron neural network, extreme gradient boosting (XGBoost), random forest, and logistic regression prediction models were applied and compared.

Results: A total of 30,687 adult patients with HFrEF were included in the analysis; 11.41% (3184/27,917) of adults experienced a hospital readmission within 30 days of their first HF hospitalization, and nearly half (9231/21,562, 42.81%) of the patients experienced at least 1 WHFE within 1 year after HF hospitalization. The prediction models and feature combinations with the best area under the receiver operating characteristic curve (AUC) for each outcome were XGBoost with CCS frequency (AUC=0.595) for 30-day readmission, random forest with CCS frequency (AUC=0.630) for 90-day readmission, XGBoost with CCS frequency (AUC=0.649) for 365-day readmission, and XGBoost with CCS frequency (AUC=0.640) for WHFEs. Our ML models could discriminate between readmission and WHFE among patients with HFrEF. Our model performance was mediocre, especially for the 30-day readmission events, most likely owing to limitations of the data, including an imbalance between positive and negative cases and high missing rates of many clinical variables and outcome definitions.

Conclusions: We predicted readmissions and WHFEs after HF hospitalizations in patients with HFrEF. Features identified by data-driven approaches may be comparable with those identified by clinical domain knowledge. Future work may be warranted to validate and improve the models using more longitudinal electronic health records that are complete, are comprehensive, and have a longer follow-up time.

(JMIR Form Res 2023;7:e41775) doi: 10.2196/41775
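For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal, illustrative sketch of (a) turning a patient's list of CCS categories into frequency features and (b) computing an AUC from labels and predicted scores. It is not the authors' pipeline: the category names (`CCS_A`, `CCS_B`, `CCS_C`), the patient identifiers, the toy labels and scores, and both helper functions are hypothetical and exist only for illustration. The only figures taken from the abstract are the 30-day readmission counts.

```python
from collections import Counter


def ccs_frequency_features(patient_codes, vocabulary):
    """Count how often each CCS category occurs in one patient's history."""
    counts = Counter(patient_codes)
    return [counts.get(category, 0) for category in vocabulary]


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical vocabulary and patient histories (placeholder category names).
vocabulary = ["CCS_A", "CCS_B", "CCS_C"]
histories = {
    "patient_a": ["CCS_A", "CCS_A", "CCS_C"],
    "patient_b": ["CCS_B"],
}
features = {
    pid: ccs_frequency_features(codes, vocabulary)
    for pid, codes in histories.items()
}

# Hypothetical labels (1 = readmitted) and model scores for four patients.
labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]
auc = roc_auc(labels, scores)

# Class balance for 30-day readmission, using the counts quoted in the abstract.
readmit_rate_pct = round(100 * 3184 / 27917, 2)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values of roughly 0.60 to 0.65. The pairwise loop is quadratic in the number of patients, so a rank-based implementation would be preferable for a cohort of the size described.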

Pages: 17
