Application of Machine Learning to Predict COVID-19 Spread via an Optimized BPSO Model

Cited by: 40
Authors
Alkhammash, Eman H. [1 ]
Assiri, Sara Ahmad [2 ]
Nemenqani, Dalal M. [3 ]
Althaqafi, Raad M. M. [3 ]
Hadjouni, Myriam [4 ]
Saeed, Faisal [5 ]
Elshewey, Ahmed M. [6 ]
Affiliations
[1] Taif Univ, Coll Comp & Informat Technol, Dept Comp Sci, POB 11099, Taif 21944, Saudi Arabia
[2] King Faisal Hosp, Otolaryngol Head & Neck Surgery Dept, POB 11099, Taif 21944, Saudi Arabia
[3] Taif Univ, Coll Med, POB 11099, Taif 21944, Saudi Arabia
[4] Princess Nourah bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Comp Sci, POB 84428, Riyadh 11671, Saudi Arabia
[5] Birmingham City Univ, Sch Comp & Digital Technol, Dept Comp & Data Sci, DAAI Res Grp, Birmingham B4 7XG, England
[6] Suez Univ, Fac Comp & Informat, Comp Sci Dept, Suez 43533, Egypt
Keywords
k-nearest neighbor; binary particle swarm optimization; random oversampling; random forest model; gradient boosting model; naive Bayes model;
DOI
10.3390/biomimetics8060457
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
During the coronavirus disease (COVID-19) pandemic, statistics showed that the number of affected cases differed from one country to another and from one city to another. In this paper, we therefore provide an enhanced model for predicting COVID-19 samples in different regions of Saudi Arabia (high-altitude and sea-level areas). The model was developed in several stages and was trained and tested on two datasets collected from Taif city (a high-altitude area) and Jeddah city (a sea-level area) in Saudi Arabia. Binary particle swarm optimization (BPSO) is used for feature selection with three machine learning models: the random forest, gradient boosting, and naive Bayes models. Several evaluation metrics, including accuracy, training score, testing score, F-measure, recall, precision, and the receiver operating characteristic (ROC) curve, were calculated to verify the performance of the three models on these datasets. The experimental results demonstrated that the gradient boosting model gives better results than the random forest and naive Bayes models on the Taif city dataset, with an accuracy of 94.6%. On the Jeddah city dataset, the random forest model outperforms the gradient boosting and naive Bayes models, with an accuracy of 95.5%. In terms of accuracy, the enhanced model achieved better results on the Jeddah city dataset than on the Taif city dataset.
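To make the feature-selection step in the abstract concrete, the following is a minimal sketch of generic binary particle swarm optimization, not the authors' implementation. Each particle is a 0/1 mask over features, velocities are updated as in standard PSO, and a sigmoid turns each velocity into the probability of setting the bit to 1. The function names (`bpso_select`, `toy_fitness`), the hyperparameter values, and the toy fitness itself are assumptions made for illustration; in a wrapper setup the fitness would typically be a classifier's validation score on the masked features.

```python
import math
import random


def bpso_select(fitness, n_features, n_particles=12, n_iters=30,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Generic binary PSO: search for the 0/1 feature mask maximizing `fitness`."""
    rng = random.Random(seed)
    # Random initial masks, zero initial velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    # Personal bests and the global best.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vel[i][d] = v
                # Sigmoid transfer: velocity -> probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Hypothetical stand-in fitness: features 0-2 are "informative", and every
# selected feature costs 0.1. A real wrapper would score a classifier instead.
INFORMATIVE = (0, 1, 2)


def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)


if __name__ == "__main__":
    mask, score = bpso_select(toy_fitness, n_features=8)
    print("selected mask:", mask, "fitness:", score)
```

Replacing `toy_fitness` with a function that trains one of the three classifiers on the selected columns and returns its validation accuracy would turn this into a wrapper-style selector of the kind the abstract describes.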
Pages: 19