Optimal multi-source forecasting of seasonal influenza

Cited by: 23
Authors
Ertem, Zeynep [1]
Raymond, Dorrie [2]
Meyers, Lauren Ancel [3, 4, 5, 6]
Affiliations
[1] Univ Texas Austin, Dept Stat & Data Sci, Austin, TX 78712 USA
[2] AthenaRes, Watertown, MA USA
[3] Univ Texas Austin, Dept Integrat Biol, Austin, TX 78712 USA
[4] Univ Texas Austin, Dept Stat, Austin, TX 78712 USA
[5] Univ Texas Austin, Dept Data Sci, Austin, TX 78712 USA
[6] Santa Fe Inst, Santa Fe, NM 87501 USA
Keywords
EPIDEMIOLOGIC RESEARCH; PUBLIC-HEALTH; DISEASE; BURDEN; WEB; SURVEILLANCE; IMPACT
DOI
10.1371/journal.pcbi.1006236
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
Forecasting the emergence and spread of influenza viruses is an important public health challenge. Timely and accurate estimates of influenza prevalence, particularly of severe cases requiring hospitalization, can improve control measures to reduce transmission and mortality. Here, we extend a previously published machine learning method for influenza forecasting to integrate multiple diverse data sources, including traditional surveillance data, electronic health records, internet search traffic, and social media activity. Our hierarchical framework uses multi-linear regression to combine forecasts from multiple data sources and greedy optimization with forward selection to sequentially choose the most predictive combinations of data sources. We show that the systematic integration of complementary data sources can substantially improve forecast accuracy over single data sources. When forecasting the Centers for Disease Control and Prevention (CDC) influenza-like illness reports (ILINet) from week 48 through week 20, the optimal combination of predictors includes public health surveillance data and commercially available electronic medical records, but neither search engine nor social media data.
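The combination step described in the abstract (a linear regression over several sources, with greedy forward selection of which sources to include) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the stopping tolerance `tol`, the use of in-sample RMSE as the selection score, and the synthetic signals are all assumptions made for illustration; a real forecast would be scored on held-out weeks.

```python
import numpy as np


def fit_rmse(X, y):
    """Fit ordinary least squares with an intercept and return in-sample RMSE."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(np.sqrt(np.mean(resid ** 2)))


def forward_select(sources, y, tol=0.01):
    """Greedily add the source that lowers RMSE most; stop when the gain < tol.

    `sources` maps a (hypothetical) source name to a 1-D array aligned with y.
    """
    chosen = []
    # Baseline: predict the mean of y with no sources at all.
    best = float(np.sqrt(np.mean((y - y.mean()) ** 2)))
    remaining = list(sources)
    while remaining:
        scores = {}
        for name in remaining:
            X = np.column_stack([sources[n] for n in chosen + [name]])
            scores[name] = fit_rmse(X, y)
        candidate = min(scores, key=scores.get)
        if best - scores[candidate] < tol:
            break  # no remaining source improves the fit enough
        chosen.append(candidate)
        best = scores[candidate]
        remaining.remove(candidate)
    return chosen, best


# Synthetic data (assumption): the target depends on two of three signals.
rng = np.random.default_rng(0)
n_weeks = 200
sources = {
    "surveillance": rng.normal(size=n_weeks),
    "ehr": rng.normal(size=n_weeks),
    "search": rng.normal(size=n_weeks),  # pure noise by construction
}
y = (2.0 * sources["surveillance"] + 1.0 * sources["ehr"]
     + 0.1 * rng.normal(size=n_weeks))

chosen, rmse = forward_select(sources, y, tol=0.01)
print(chosen, round(rmse, 3))
```

Because in-sample error never increases when a predictor is added, the tolerance is what keeps uninformative sources out in this sketch; scoring on held-out data would be a more faithful stand-in for forecast accuracy.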
Pages: 16