Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study

Cited by: 62
Authors
Fang, Zheng-gang [1]
Yang, Shu-qin [1]
Lv, Cai-xia [1]
An, Shu-yi [2]
Wu, Wei [1]
Affiliations
[1] China Med Univ, Dept Epidemiol, Shenyang, Peoples R China
[2] Liaoning Prov Ctr Dis Control & Prevent, Dept Social Med & Hlth, Shenyang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
COVID-19; epidemiology
DOI
10.1136/bmjopen-2021-056685
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Objective: The COVID-19 outbreak was first reported in Wuhan, China, and has been acknowledged as a pandemic due to its rapid spread worldwide. Predicting the trend of COVID-19 is of great significance for its prevention. A comparison between the autoregressive integrated moving average (ARIMA) model and the eXtreme Gradient Boosting (XGBoost) model was conducted to determine which was more accurate for anticipating the occurrence of COVID-19 in the USA.
Design: Time-series study.
Setting: The USA was the setting for this study.
Main outcome measures: Three accuracy metrics, mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE), were applied to evaluate the performance of the two models.
Results: In our study, for the training set and the validation set, the MAE, RMSE and MAPE of the XGBoost model were less than those of the ARIMA model.
Conclusions: The XGBoost model can help improve prediction of COVID-19 cases in the USA over the ARIMA model.
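Note: The three accuracy metrics named in the abstract can be sketched in a few lines of Python using their standard textbook definitions. This is a minimal illustration, not the authors' code; the function names and the sample case counts are hypothetical and do not come from the study.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero; a zero count would make the
    ratio undefined.
    """
    return 100 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)

# Hypothetical daily case counts and forecasts, for illustration only.
actual = [100, 200, 400]
predicted = [110, 190, 380]

print(f"MAE:  {mae(actual, predicted):.3f}")
print(f"RMSE: {rmse(actual, predicted):.3f}")
print(f"MAPE: {mape(actual, predicted):.3f}%")
```

For all three metrics a lower value means a closer fit, which is the sense in which the abstract reports the XGBoost values as smaller than the ARIMA values.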
Pages: 8