COVID-19 Mortality Prediction From Deep Learning in a Large Multistate Electronic Health Record and Laboratory Information System Data Set: Algorithm Development and Validation

Citations: 24
Authors
Sankaranarayanan, Saranya [1]
Balan, Jagadheshwar [1]
Walsh, Jesse R. [1]
Wu, Yanhong [1]
Minnich, Sara [1]
Piazza, Amy [1]
Osborne, Collin [1]
Oliver, Gavin R. [1]
Lesko, Jessica [1]
Bates, Kathy L. [1]
Khezeli, Kia [1]
Block, Darci R. [1]
DiGuardo, Margaret [1]
Kreuter, Justin [1]
O'Horo, John C. [1]
Kalantari, John [1]
Klee, Eric W. [1]
Salama, Mohamed E. [1]
Kipp, Benjamin [1]
Morice, William G. [1]
Jenkinson, Garrett [1]
Affiliations
[1] Mayo Clin, 200 1st St SW, Rochester, MN 55905 USA
Keywords
COVID-19; mortality; prediction; recurrent neural networks; missing data; time series; deep learning; machine learning; neural network; electronic health record; EHR; algorithm; development; validation
DOI
10.2196/30157
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)]
Abstract
Background: COVID-19 is caused by the SARS-CoV-2 virus and has strikingly heterogeneous clinical manifestations, with most individuals contracting mild disease but a substantial minority experiencing fulminant cardiopulmonary symptoms or death. The clinical covariates and the laboratory tests performed on a patient provide robust statistics to guide clinical treatment. Deep learning approaches on a data set of this nature enable patient stratification and provide methods to guide clinical treatment.

Objective: Here, we report on the development and prospective validation of a state-of-the-art machine learning model to provide mortality prediction shortly after confirmation of SARS-CoV-2 infection in the Mayo Clinic patient population.

Methods: We retrospectively constructed one of the largest reported and most geographically diverse COVID-19 laboratory information system and electronic health record data sets in the published literature, which included 11,807 patients residing in 41 states of the United States of America and treated at medical sites across 5 states in 3 time zones. Traditional machine learning models were evaluated independently as well as in a stacked learner approach by using AutoGluon, and various recurrent neural network architectures were considered. The traditional machine learning models were implemented using the AutoGluon-Tabular framework, whereas the recurrent neural networks utilized the TensorFlow Keras framework. We trained these models to operate solely using routine laboratory measurements and clinical covariates available within 72 hours of a patient's first positive COVID-19 nucleic acid test result.

Results: The GRU-D recurrent neural network achieved peak cross-validation performance with 0.938 (SE 0.004) as the area under the receiver operating characteristic (AUROC) curve. This model retained strong performance by reducing the follow-up time to 12 hours (0.916 [SE 0.005] AUROC), and the leave-one-out feature importance analysis indicated that the most independently valuable features were age, Charlson comorbidity index, minimum oxygen saturation, fibrinogen level, and serum iron level. In the prospective testing cohort, this model provided an AUROC of 0.901 and a statistically significant difference in survival (P<.001, hazard ratio for those predicted to survive, 95% CI 0.043-0.106).

Conclusions: Our deep learning approach using GRU-D provides an alert system to flag mortality for COVID-19-positive patients by using clinical covariates and laboratory values within a 72-hour window after the first positive nucleic acid test result.
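
The abstract names GRU-D as the best-performing architecture but does not show how it handles laboratory values that are measured irregularly. As a rough illustration only, the sketch below follows the input-decay idea from the GRU-D formulation (Che et al., Scientific Reports, 2018): each variable carries an observation mask and a time-since-last-observation gap, and a missing value is filled by decaying from the last observed value toward the variable's mean. This is not the authors' TensorFlow Keras code. The function names, the example oxygen-saturation readings, and the fixed `w` and `b` (which GRU-D learns during training) are all assumptions made for this example.

```python
import math


def grud_inputs(times, values):
    """Build the mask and time-gap vectors for one variable.

    times  : ascending timestamps in hours (hypothetical sampling grid)
    values : measurement at each timestamp, or None when not measured
    """
    mask = [0 if v is None else 1 for v in values]
    delta = [0.0] * len(times)
    for t in range(1, len(times)):
        gap = times[t] - times[t - 1]
        # If the previous step was missing, the gap keeps accumulating.
        delta[t] = gap if mask[t - 1] == 1 else gap + delta[t - 1]
    return mask, delta


def decayed_input(values, mask, delta, mean, w=0.1, b=0.0):
    """Fill missing steps by decaying the last observation toward the mean.

    w and b are fixed placeholders here; in GRU-D they are trained.
    Before any observation exists, the mean is used as the last value
    (an assumption of this sketch).
    """
    filled, last = [], mean
    for v, m, d in zip(values, mask, delta):
        if m == 1:
            last = v
            filled.append(v)
        else:
            gamma = math.exp(-max(0.0, w * d + b))  # decay rate in (0, 1]
            filled.append(gamma * last + (1.0 - gamma) * mean)
    return filled


# Hypothetical oxygen-saturation readings within a 72-hour window.
times = [0.0, 12.0, 24.0, 48.0]
spo2 = [95.0, None, None, 91.0]
mask, delta = grud_inputs(times, spo2)
filled = decayed_input(spo2, mask, delta, mean=96.0)
```

With these inputs, the two missing readings move progressively from the last observed value (95.0) toward the assumed mean (96.0) as the gap grows, while observed readings pass through unchanged.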
Pages: 16