Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study

Cited by: 50
Authors
Fang, Zheng-gang [1]
Yang, Shu-qin [1]
Lv, Cai-xia [1]
An, Shu-yi [2]
Wu, Wei [1]
Affiliations
[1] China Med Univ, Dept Epidemiol, Shenyang, Peoples R China
[2] Liaoning Prov Ctr Dis Control & Prevent, Dept Social Med & Hlth, Shenyang, Peoples R China
Source
BMJ OPEN | 2022 / Vol. 12 / Issue 07
Funding
National Natural Science Foundation of China;
Keywords
COVID-19; epidemiology;
DOI
10.1136/bmjopen-2021-056685
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002; 100201;
Abstract
Objective: The COVID-19 outbreak was first reported in Wuhan, China, and has been acknowledged as a pandemic due to its rapid spread worldwide. Predicting the trend of COVID-19 is of great significance for its prevention. A comparison between the autoregressive integrated moving average (ARIMA) model and the eXtreme Gradient Boosting (XGBoost) model was conducted to determine which was more accurate for anticipating the occurrence of COVID-19 in the USA.
Design: Time-series study.
Setting: The USA was the setting for this study.
Main outcome measures: Three accuracy metrics, mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE), were applied to evaluate the performance of the two models.
Results: In our study, for the training set and the validation set, the MAE, RMSE and MAPE of the XGBoost model were less than those of the ARIMA model.
Conclusions: The XGBoost model can help improve prediction of COVID-19 cases in the USA over the ARIMA model.
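The three accuracy metrics named in the abstract have standard textbook definitions, which can be sketched as below. This is a minimal illustration, not the authors' code: the function names and the sample numbers are hypothetical, and the abstract does not say how the metrics were computed in the study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero; the ratio is undefined otherwise.
    """
    return 100.0 * sum(
        abs(a - p) / abs(a) for a, p in zip(actual, predicted)
    ) / len(actual)


if __name__ == "__main__":
    # Hypothetical daily case counts and forecasts, for illustration only.
    observed = [100, 200, 400]
    forecast = [110, 190, 380]
    print("MAE :", mae(observed, forecast))
    print("RMSE:", rmse(observed, forecast))
    print("MAPE:", mape(observed, forecast))
```

Lower values on all three metrics indicate a closer fit, which is the sense in which the abstract reports XGBoost outperforming ARIMA.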
Pages: 8