Disease outbreak prediction using natural language processing: a review

Cited by: 0
Authors
Gautam, Avneet Singh [1]
Raza, Zahid [1]
Affiliation
[1] Jawaharlal Nehru Univ, Sch Comp & Syst Sci, JNU Ring Rd, New Delhi 110067, India
Keywords
Disease outbreak prediction; Natural language processing; Text analysis; Clustering; Machine learning; News data; Search data; Twitter data; EAST RESPIRATORY SYNDROME; SOCIAL MEDIA; SOUTH-KOREA; SURVEILLANCE; INTELLIGENCE; TWITTER; EBOLA; COVID-19; SYSTEMS
DOI
10.1007/s10115-024-02192-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Research on disease outbreak prediction has attracted enormous interest since the COVID-19 pandemic. Natural language processing applied to user-generated text has proven effective for this task. Outbreaks of frequently occurring diseases can be predicted easily, whereas outbreaks of novel diseases are difficult to predict. This review summarizes research on disease outbreak prediction that applies natural language processing techniques to datasets such as news headlines, tweets, and search engine queries. Existing state-of-the-art systems are discussed analytically, along with their contributions and limitations, giving an overview of existing research in the domain. A total of 146 articles were reviewed, and the results show that news and Twitter datasets are the most widely used for predicting disease outbreaks. The review also finds that much of the literature relies on Internet-sourced text data (news, tweets, and search engine queries) tied to specific outbreaks. This is a limitation, because a system built on such data can predict only those specific disease outbreaks; it motivates the development of systems capable of predicting disease outbreaks without such bias.
Pages: 6561-6595
Number of pages: 35