Tweetluenza: Predicting Flu Trends from Twitter Data

Cited by: 31
Authors
Alkouz, Balsam [1]
Al Aghbari, Zaher [1]
Abawajy, Jemal Hussien [2]
Affiliations
[1] Univ Sharjah, Dept Comp Sci, Sharjah 27272, U Arab Emirates
[2] Deakin Univ, Dept Sci Engn & Built Environm, Melbourne, Vic 3125, Australia
Keywords
Twitter data analysis; influenza forecasting; prediction using social media; social media mining
DOI
10.26599/BDMA.2019.9020012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Health authorities worldwide strive to detect influenza prevalence as early as possible so that they can prepare for it and minimize its impact. To this end, we address the problem of influenza prevalence surveillance and prediction. In this paper, we develop a new influenza prevalence prediction model, called Tweetluenza, that predicts the spread of influenza in real time using cross-lingual data harvested from Twitter data streams, with an emphasis on the United Arab Emirates (UAE). Based on the features of tweets, Tweetluenza filters the influenza-related tweets and classifies them into two classes: reporting and non-reporting. The reporting tweets are used to monitor the growth of influenza. Furthermore, a linear regression model leverages the reporting tweets to predict future influenza-related hospital visits. We evaluated Tweetluenza empirically to study its feasibility and compared its results with the actual hospital visits recorded by the UAE Ministry of Health. The results of our experiments demonstrate the practicality of Tweetluenza, which was verified by the high correlation between the influenza-related Twitter data and hospital visits due to influenza. Furthermore, the evaluation of the analysis and prediction of influenza shows that combining English and Arabic tweets improves the correlation results.
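The abstract names two quantitative steps: a linear regression from reporting tweets to hospital visits, and a correlation between the two series. The following is a minimal sketch of those two steps only, under stated assumptions: the function names (`fit_line`, `pearson_r`), the weekly granularity, the use of a single predictor, and all numbers are hypothetical illustrations, not the paper's implementation or data.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


# Made-up weekly counts, for illustration only (NOT data from the paper).
reporting_tweets = [12, 18, 25, 31, 40, 52]
hospital_visits = [110, 140, 185, 210, 270, 335]

slope, intercept = fit_line(reporting_tweets, hospital_visits)
r = pearson_r(reporting_tweets, hospital_visits)

# Predict visits for a hypothetical week with 60 reporting tweets.
predicted_visits = slope * 60 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")
print(f"predicted visits for 60 reporting tweets: {predicted_visits:.0f}")
```

A single-predictor fit keeps the sketch short; how the paper combines English and Arabic tweet counts, and over what time horizon it predicts, is not stated in the abstract and is not modeled here.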
Pages: 273-287
Page count: 15