Development of decision tree classification algorithms in predicting mortality of COVID-19 patients

Cited by: 4
Authors
Mohammadi-Pirouz, Zahra [1]
Hajian-Tilaki, Karimollah [2, 3]
Sadeghi Haddat-Zavareh, Mahmoud [4]
Amoozadeh, Abazar [3]
Bahrami, Shabnam [1]
Affiliations
[1] Babol Univ Med Sci, Res Inst, Student Res Ctr, Babol, Iran
[2] Babol Univ Med Sci, Sch Publ Hlth, Dept Biostat & Epidemiol, Babol, Iran
[3] Babol Univ Med Sci, Res Inst, Social Determinants Hlth Res Ctr, Babol, Iran
[4] Babol Univ Med Sci, Ayatollah Rohani Hosp, Dept Infect Dis, Babol, Iran
Keywords
Decision tree; CART; C5.0; CHAID; Logistic regression; COVID-19; Mortality; Predictive factors
DOI
10.1186/s12245-024-00681-7
Chinese Library Classification (CLC) number
R4 [Clinical Medicine]
Subject classification codes
1002; 100602
Abstract
Introduction: Accurate prediction of COVID-19 mortality risk, taking influencing factors into account, is crucial for guiding effective public policies that alleviate the strain on the healthcare system. This study therefore aimed to assess the efficacy of decision tree algorithms (CART, C5.0, and CHAID) in predicting COVID-19 mortality risk and to compare their performance with that of the logistic regression model.

Methods: This retrospective cohort study examined 5080 COVID-19 patients in Babol, a city in northern Iran, who tested positive for the virus by PCR between March 2020 and March 2022. To check the validity of the findings, the data were randomly divided into an 80% training set and a 20% testing set. The prediction models (logistic regression and the decision tree algorithms) were trained on the training set and tested on the testing set. Their performance on the test samples was assessed using measures such as the ROC curve, AUC, sensitivity, and specificity.

Results: The mortality rate among COVID-19 patients admitted to hospital was 7.7%. On cross-validation, the CHAID algorithm outperformed the other decision tree algorithms and logistic regression in specificity and precision, but not in sensitivity, in predicting the risk of COVID-19 mortality. CHAID achieved a specificity, precision, accuracy, and F-score of 0.98, 0.70, 0.95, and 0.52, respectively. All models indicated that ICU hospitalization, intubation, age, kidney disease, BUN, CRP, WBC, NLR, O2 saturation, and hemoglobin were among the factors that influenced mortality in COVID-19 patients.

Conclusions: The CART and C5.0 models performed better in sensitivity, whereas CHAID performed better than the other decision tree algorithms in specificity, precision, and accuracy, and showed a slight improvement over logistic regression in predicting the risk of COVID-19 mortality in the study population.
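
The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the confusion-matrix counts below are hypothetical and chosen only for illustration. The second helper rearranges the F-score definition to show that a precision of 0.70 together with an F-score of 0.52 implies a sensitivity of roughly 0.41, which is only an approximation because the reported figures are rounded.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives, with "positive" meaning the patient died.
    """
    sensitivity = tp / (tp + fn)          # recall: deaths correctly flagged
    specificity = tn / (tn + fp)          # survivors correctly cleared
    precision = tp / (tp + fp)            # flagged patients who actually died
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_score": f_score,
    }


def recall_from_f_score(precision, f_score):
    """Invert F = 2PR / (P + R) to recover recall R from P and F."""
    return f_score * precision / (2 * precision - f_score)


# Hypothetical counts for a test set of 1000 patients (NOT the study's data).
example = classification_metrics(tp=40, fp=10, tn=900, fn=50)

# Sensitivity implied by the rounded CHAID figures quoted in the abstract.
implied_sensitivity = recall_from_f_score(precision=0.70, f_score=0.52)
```

With a low event rate such as the 7.7% mortality reported here, accuracy and specificity can be high even when sensitivity is modest, which is why the abstract reports several measures side by side.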
Pages: 18