Comparing different machine learning techniques for predicting COVID-19 severity

Cited by: 42
Authors
Xiong, Yibai [1]
Ma, Yan [1]
Ruan, Lianguo [2]
Li, Dan [3]
Lu, Cheng [1]
Huang, Luqi [4]
Affiliations
[1] China Acad Chinese Med Sci, Inst Basic Res Clin Med, 16 Nanxiao St, Beijing 100700, Peoples R China
[2] JinYinTan Hosp, Dept Infect Dis, Wuhan 430040, Peoples R China
[3] Chinese Ctr Dis Control & Prevent, Informat Ctr, Beijing 102206, Peoples R China
[4] China Acad Chinese Med Sci, Natl Resource Ctr Chinese Mat Med, 16 Nanxiao St, Beijing 100700, Peoples R China
Keywords
COVID-19; Severity; Machine learning; Support vector machine; Random Forest; Logistic regression; RISK-FACTORS; METAANALYSIS; OUTCOMES; CHINA;
DOI
10.1186/s40249-022-00946-4
Chinese Library Classification number
R51 [Infectious diseases]
Discipline classification code
100401
Abstract
Background: Coronavirus disease 2019 (COVID-19) continues to spread globally. Machine learning techniques have been used to diagnose disease and predict treatment outcomes, with favorable performance. The present study aims to predict COVID-19 severity at admission using different machine learning techniques, including random forest (RF), support vector machine (SVM), and logistic regression (LR). The importance of individual features to COVID-19 severity was also identified.
Methods: A retrospective study was conducted at JinYinTan Hospital from January 26 to March 28, 2020. Eighty-six demographic, clinical, and laboratory features were screened using the LassoCV method, Spearman's rank correlation, expert opinion, and literature evaluation. RF, SVM, and LR models were built to predict severe COVID-19, and their performance was compared by the area under the curve (AUC). Feature importance to COVID-19 severity was then analyzed with the best-performing model.
Results: A total of 287 patients were enrolled, of whom 36.6% were severe cases and 63.4% were non-severe cases. The median age was 60.0 years (interquartile range: 49.0-68.0 years). Three models were established using 23 features: 1 clinical, 1 chest computed tomography (CT), and 21 laboratory features. Among the three models, RF yielded the best overall performance, with the highest AUC (0.970, versus 0.948 for SVM and 0.928 for LR); RF achieved a sensitivity of 96.7%, a specificity of 69.5%, and an accuracy of 84.5%. SVM had a sensitivity of 93.9%, a specificity of 79.0%, and an accuracy of 88.5%. LR had a sensitivity of 92.3%, a specificity of 72.3%, and an accuracy of 85.2%. Chest CT had the highest importance to illness severity, followed by neutrophil-to-lymphocyte ratio, lactate dehydrogenase, and D-dimer.
Conclusions: Our results indicated that RF could be a useful predictive tool to identify patients with severe COVID-19, which may facilitate effective care and further optimize resources.
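The abstract compares the models by AUC, sensitivity, specificity, and accuracy. As a minimal illustration of how these four quantities relate to predicted scores and labels, the sketch below computes them in plain Python. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight-patient toy data are all assumptions made for illustration, and AUC is assumed here to mean the area under the ROC curve.

```python
from typing import Sequence, Tuple


def classification_rates(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy).

    Labels are assumed to be 1 = severe, 0 = non-severe.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # severe, predicted severe
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # severe, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-severe, predicted non-severe
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-severe, false alarm
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (severe, non-severe) pairs in which the
    severe case receives the higher score; tied scores count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and model scores for eight patients.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.1, 0.7]
# Assumed decision threshold of 0.5 turns scores into class predictions.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_rates = classification_rates(toy_labels, toy_preds)
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_rates, toy_auc)
```

On this toy data, sensitivity, specificity, and accuracy all depend on the chosen threshold, whereas AUC depends only on how the scores rank severe cases against non-severe ones, which is one reason it is a convenient single number for comparing models such as RF, SVM, and LR.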
Pages: 9