COVID-19 Patient Health Prediction Using Boosted Random Forest Algorithm

Cited by: 284
Authors
Iwendi, Celestine [1]
Bashir, Ali Kashif [2]
Peshkar, Atharva [3]
Sujatha, R. [4]
Chatterjee, Jyotir Moy [5]
Pasupuleti, Swetha [6]
Mishra, Rishita [7]
Pillai, Sofia [6]
Jo, Ohyun [8]
Affiliations
[1] BCC Cent South Univ Forestry & Technol, Changsha, Peoples R China
[2] Manchester Metropolitan Univ, Dept Comp & Math, Manchester, Lancs, England
[3] GH Raisoni Coll Engn, Dept Informat Technol, Nagpur, Maharashtra, India
[4] VIT Univ, Sch Informat Technol & Engn, Vellore, Tamil Nadu, India
[5] Lord Buddha Educ Fdn, Dept Informat Technol, Kathmandu, Nepal
[6] Galgotias Univ, Sch Civil Engn, Greater Noida, India
[7] GH Raisoni Coll Engn, Dept Elect & Telecommun Engn, Nagpur, Maharashtra, India
[8] Chungbuk Natl Univ, Coll Elect & Comp Engn, Dept Comp Sci, Cheongju, South Korea
Funding
National Research Foundation of Singapore;
Keywords
COVID-19; healthcare analytics; patient data; infection; boosting; random forest classification;
DOI
10.3389/fpubh.2020.00357
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
The integration of artificial intelligence (AI) techniques into wireless infrastructure and into the real-time collection and processing of end-user device data is now in high demand. It is now imperative to use AI to detect and predict pandemics of a colossal nature. The Coronavirus disease 2019 (COVID-19) pandemic, which originated in Wuhan, China, has had disastrous effects on the global community and has overburdened advanced healthcare systems throughout the world. Globally, over 4,063,525 confirmed cases and 282,244 deaths had been recorded as of 11 May 2020, according to the European Centre for Disease Prevention and Control. The rapid, exponential rise in the number of patients has made it necessary to predict an infected patient's likely outcome quickly and efficiently using AI techniques, so that appropriate treatment can be given. This paper proposes a fine-tuned Random Forest model boosted by the AdaBoost algorithm. The model uses the COVID-19 patient's geographical, travel, health, and demographic data to predict the severity of the case and the possible outcome: recovery or death. The model has an accuracy of 94% and an F1 score of 0.86 on the dataset used. The data analysis reveals a positive correlation between patients' gender and deaths, and also indicates that the majority of patients are aged between 20 and 70 years.
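The idea of a random forest boosted by AdaBoost, and the two metrics quoted in the abstract, can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the function names, the hyperparameters, the use of one-split "stumps" in place of full decision trees, and the synthetic two-feature data (age and a hypothetical delay in days), which is not the patient dataset used by the authors.

```python
import math
import random

def fit_stump(X, y, features):
    """Best one-feature threshold rule; labels are -1 or +1."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for sign in (1, -1):
                errors = sum(1 for row, label in zip(X, y)
                             if (sign if row[f] <= t else -sign) != label)
                if best is None or errors < best[0]:
                    best = (errors, f, t, sign)
    _, f, t, sign = best
    return lambda row: sign if row[f] <= t else -sign

def fit_forest(X, y, weights, n_trees, rng):
    """Tiny random forest: each stump is trained on a weighted bootstrap
    sample and a random subset of the features; trees vote by majority."""
    n_features = len(X[0])
    k = max(1, int(math.sqrt(n_features)))
    trees = []
    for _ in range(n_trees):
        idx = rng.choices(range(len(X)), weights=weights, k=len(X))
        feats = rng.sample(range(n_features), k)
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return lambda row: 1 if sum(tree(row) for tree in trees) >= 0 else -1

def fit_boosted_forest(X, y, n_rounds=10, n_trees=5, seed=0):
    """Discrete AdaBoost that uses a small random forest as the weak learner."""
    rng = random.Random(seed)
    weights = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(n_rounds):
        forest = fit_forest(X, y, weights, n_trees, rng)
        preds = [forest(row) for row in X]
        err = sum(w for w, p, label in zip(weights, preds, y) if p != label)
        err = min(max(err, 1e-10), 1 - 1e-10)      # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)    # vote of this forest
        ensemble.append((alpha, forest))
        # Raise the weight of misclassified samples, lower the rest.
        weights = [w * math.exp(-alpha * label * p)
                   for w, p, label in zip(weights, preds, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return lambda row: 1 if sum(a * f(row) for a, f in ensemble) >= 0 else -1

def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and F1 score for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f1

# Synthetic toy data (NOT the paper's dataset): [age, hypothetical delay in days].
# Label +1 stands for death, -1 for recovery, with 10% label noise.
data_rng = random.Random(42)
X = [[data_rng.randint(20, 90), data_rng.randint(0, 14)] for _ in range(60)]
y = [1 if (age > 60) != (data_rng.random() < 0.1) else -1 for age, _ in X]

model = fit_boosted_forest(X, y)
y_pred = [model(row) for row in X]
acc, f1 = accuracy_and_f1(y, y_pred)
print(f"training accuracy={acc:.2f}, F1={f1:.2f}")
```

The printed figures are training-set values on toy data and say nothing about the 94% accuracy and 0.86 F1 score reported above. In practice a library such as scikit-learn, which provides `AdaBoostClassifier` and `RandomForestClassifier`, would be the more usual choice than hand-written stumps.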
Pages: 9