Predicting social response to infectious disease outbreaks from internet-based news streams

Cited by: 15
Authors
Fast, Shannon M. [1]
Kim, Louis [1]
Cohn, Emily L. [2]
Mekaru, Sumiko R. [2]
Brownstein, John S. [2]
Markuzon, Natasha [1]
Affiliations
[1] Charles Stark Draper Laboratory, Information and Decision Systems Division, Cambridge, MA 02139, USA
[2] Harvard Medical School, Boston Children's Hospital, Boston, MA, USA
Keywords
Biosurveillance; Social response; Epidemics; Anomaly detection; Near real-time prediction; Public health
DOI
10.1007/s10479-017-2480-9
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
Infectious disease outbreaks often have consequences beyond human health, including concern among the population, economic instability, and sometimes violence. A warning system capable of anticipating social disruptions resulting from disease outbreaks is urgently needed to help decision makers prepare appropriately. We designed a system that operates in near real-time to identify and predict social response. Over 150,000 Internet-based news articles related to outbreaks of 16 diseases in 72 countries and territories were provided by HealthMap. These articles were automatically tagged with indicators of the disease activity and population reaction. An anomaly detection algorithm was implemented on the population reaction indicators to identify periods of unusually severe social response. Then a model was developed to predict the probability of these periods of unusually severe social response occurring in the coming 1, 2, and 3 weeks. This model exhibited remarkably strong performance for diseases with substantial media coverage. For country-disease pairs with a median of 20 or more articles per year, the onset of social response in the next week was correctly predicted over 60% of the time, and 87% of weeks were correctly predicted. Performance was weaker for diseases with little media coverage, and, for these diseases, the main utility of our system is in identifying social response when it occurs, rather than predicting when it will happen in the future. Overall, the developed near real-time prediction approach is a promising step toward developing predictive models to inform responders of the likely social consequences of disease spread.
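As an illustration of the anomaly detection step mentioned in the abstract, the following is a minimal sketch that flags weeks whose count of population-reaction articles is unusually high relative to a trailing window. The abstract does not specify the algorithm used, so the trailing z-score rule, the function name flag_anomalous_weeks, the window and threshold values, and the sample counts are all assumptions made for illustration only.

```python
from statistics import mean, pstdev


def flag_anomalous_weeks(weekly_counts, window=8, threshold=2.0):
    """Return indices of weeks whose count is unusually high.

    Hypothetical rule (not taken from the paper): a week is flagged when its
    count lies more than `threshold` standard deviations above the mean of
    the preceding `window` weeks.
    """
    flagged = []
    for i in range(window, len(weekly_counts)):
        history = weekly_counts[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        # Guard against a flat history, where the standard deviation is zero.
        scale = sigma if sigma > 0 else 1.0
        if (weekly_counts[i] - mu) / scale > threshold:
            flagged.append(i)
    return flagged


# Made-up weekly counts of articles tagged with a social-response indicator.
counts = [2, 3, 2, 4, 3, 2, 3, 2, 3, 15, 4, 3]
print(flag_anomalous_weeks(counts))  # [9]
```

Only the spike in week 9 is flagged here. The weeks that follow are compared against a history that already contains the spike, so they are not flagged; a production system would likely need a more robust baseline.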
Pages: 551-564
Page count: 14