Modeling COVID-19 incidence with Google Trends

Cited by: 12
Authors
Amusa, Lateef Babatunde [1]
Twinomurinzi, Hossana [1]
Okonkwo, Chinedu Wilfred [1]
Affiliations
[1] Univ Johannesburg, Coll Business & Econ, Ctr Appl Data Sci, Johannesburg, South Africa
Keywords
Big Data; Google Trends; ARIMA; COVID-19; infectious disease modeling; TIME-SERIES;
DOI
10.3389/frma.2022.1003972
Chinese Library Classification (CLC) number
G25 [Library Science, Librarianship]; G35 [Information Science, Information Work]
Discipline classification code
1205; 120501
Abstract
Infodemiological methods could enhance the modeling of infectious diseases. We assessed the utility of these methods in a Nigerian case study. We used Google Trends (GT) data to track COVID-19 incidence and assessed whether they could complement traditional modeling based solely on reported case numbers. Weekly Nigerian COVID-19 case counts from March 1, 2020, to May 31, 2021, were matched with internet search data from Google Trends. The reported weekly incidence numbers and the GT data were split into training and testing sets. ARIMA models were fitted to the reported weekly COVID-19 cases in the training set. Several COVID-related search terms were assessed theoretically and empirically in an initial screening, and the selected GT variable was added to the ARIMA model as a regressor. Model forecasts, with and without GT data, were compared with weekly cases in the test set over 13 weeks. Forecast accuracies were compared visually and using the root mean squared error (RMSE) and the mean absolute error (MAE). The statistical significance of the difference in predictions was determined with the two-sided Diebold-Mariano (DM) test. Preliminary contemporaneous correlations between COVID-related search terms and weekly COVID-19 cases showed that "loss of smell," "loss of taste," and "fever" (in order of magnitude) were significantly associated with the official cases. Predictions of the ARIMA model based solely on reported case numbers gave an RMSE of 411.4 and an MAE of 354.9. The GT-expanded model achieved better forecasting accuracy (RMSE = 388.7, MAE = 340.1). The corrected Akaike information criterion also favored the GT-expanded model (869.4 vs. 872.2). The difference in predictive performance over the 13 weeks was significant according to the two-sided DM test (DM = 6.75, p < 0.001).
Google Trends data enhanced the predictive ability of a model based on traditional data and should be considered a suitable means of enhancing infectious disease modeling.
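The forecast comparison described in the abstract (RMSE, MAE, and a two-sided Diebold-Mariano test) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes squared-error loss and a one-step forecast horizon for the DM statistic, neither of which the abstract states, and it uses a normal approximation for the p-value. The function names and the four-week toy series are hypothetical and are not data from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def diebold_mariano(actual, pred_a, pred_b):
    """Two-sided Diebold-Mariano test (assumed: squared-error loss, horizon 1).

    Returns (statistic, p_value). A positive statistic means pred_a has the
    larger average squared error, i.e. pred_b forecasts better.
    """
    # Loss differential for each forecast period.
    d = [(a - pa) ** 2 - (a - pb) ** 2 for a, pa, pb in zip(actual, pred_a, pred_b)]
    n = len(d)
    d_bar = sum(d) / n
    # With horizon 1, only the lag-0 autocovariance enters the variance estimate.
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    stat = d_bar / math.sqrt(var_d / n)
    # Two-sided p-value under the standard normal approximation.
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value


# Hypothetical toy data: weekly cases and two competing forecasts.
weekly_cases = [100, 120, 130, 150]
forecast_cases_only = [90, 130, 120, 165]   # stand-in for the model without GT
forecast_with_gt = [95, 125, 128, 155]      # stand-in for the GT-expanded model

print("RMSE:", rmse(weekly_cases, forecast_cases_only), rmse(weekly_cases, forecast_with_gt))
print("MAE: ", mae(weekly_cases, forecast_cases_only), mae(weekly_cases, forecast_with_gt))
print("DM:  ", diebold_mariano(weekly_cases, forecast_cases_only, forecast_with_gt))
```

For multi-step horizons or short test sets, a small-sample correction and additional autocovariance terms are commonly applied; the sketch omits both for brevity.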
Pages: 8