PADI-web: An Event-Based Surveillance System for Detecting, Classifying and Processing Online News

Cited by: 2
Authors
Valentin, Sarah [1, 2, 3]
Arsevska, Elena [2, 3]
Mercier, Alize [2, 3]
Falala, Sylvain [2]
Rabatel, Julien [3]
Lancelot, Renaud [2, 3]
Roche, Mathieu [1, 3]
Affiliations
[1] Univ Montpellier, UMR TETIS, AgroParisTech, CIRAD, CNRS, INRAE, F-34398 Montpellier, France
[2] Univ Montpellier, UMR ASTRE, CIRAD, INRAE, F-34398 Montpellier, France
[3] CIRAD, Montpellier, France
Source
HUMAN LANGUAGE TECHNOLOGY. CHALLENGES FOR COMPUTER SCIENCE AND LINGUISTICS, LTC 2017 | 2020 / Vol. 12598
Funding
European Union Horizon 2020;
Keywords
Epidemic intelligence; Animal health; Web monitoring; Text mining; Classification; Information extraction;
DOI
10.1007/978-3-030-66527-2_7
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Platform for Automated Extraction of Animal Disease Information from the Web (PADI-web) is a multilingual text mining tool for the automatic detection, classification, and extraction of disease outbreak information from online news articles. PADI-web currently monitors the Web for nine animal infectious diseases and eight syndromes in five animal hosts. The classification module uses a supervised machine learning approach to filter relevant news, with an overall accuracy of 0.94. The classification of relevant news into five topic categories (confirmed, suspected, or unknown outbreak; preparedness; and impact) achieved an overall accuracy of 0.75. In the first six months of its implementation (January-June 2016), PADI-web detected 73% of the outbreaks of African swine fever, 20% of foot-and-mouth disease, 13% of bluetongue, and 62% of highly pathogenic avian influenza. The information extraction module of PADI-web obtained F-scores of 0.80 for locations, 0.85 for dates, 0.95 for diseases, 0.95 for hosts, and 0.85 for case numbers. PADI-web thus provides a complementary source of disease surveillance in the domain of animal health.
Pages: 87-101
Page count: 15
Related papers
22 in total
  • [1] Ahlers D., 2013, P 7 WORKSH GEOGR INF, P74, DOI 10.1145/2533888.2533938
  • [2] [Anonymous], 2008, NATO SCI PEACE SECUR, DOI 10.3233/978-1-58603-898-4-295
  • [3] Web monitoring of emerging animal infectious diseases integrated in the French Animal Health Epidemic Intelligence System
    Arsevska, Elena
    Valentin, Sarah
    Rabatel, Julien
    de Herve, Jocelyn de Goer
    Falala, Sylvain
    Lancelot, Renaud
    Roche, Mathieu
    [J]. PLOS ONE, 2018, 13 (08):
  • [4] Identification of terms for detecting early signals of emerging infectious disease outbreaks on the web
    Arsevska, Elena
    Roche, Mathieu
    Hendrikx, Pascal
    Chavernac, David
    Falala, Sylvain
    Lancelot, Renaud
    Dufour, Barbara
    [J]. COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2016, 123 : 104 - 115
  • [5] The Unified Medical Language System (UMLS): integrating biomedical terminology
    Bodenreider, O
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : D267 - D270
  • [6] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [7] Surveillance sans frontieres: Internet-based emerging infectious disease intelligence and the HealthMap project
    Brownstein, John S.
    Freifeld, Clark C.
    Reis, Ben Y.
    Mandl, Kenneth D.
    [J]. PLOS MEDICINE, 2008, 5 (07) : 1019 - 1024
  • [8] GENI-DB: a database of global events for epidemic intelligence
    Collier, Nigel
    Doan, Son
    [J]. BIOINFORMATICS, 2012, 28 (08) : 1186 - 1188
  • [9] BioCaster: detecting public health rumors with a Web-based text mining system
    Collier, Nigel
    Doan, Son
    Kawazoe, Ai
    Goodwin, Reiko Matsuda
    Conway, Mike
    Tateno, Yoshio
    Quoc-Hung Ngo
    Dinh Dien
    Kawtrakul, Asanee
    Takeuchi, Koichi
    Shigematsu, Mika
    Taniguchi, Kiyosu
    [J]. BIOINFORMATICS, 2008, 24 (24) : 2940 - 2941
  • [10] Joachims T, 1998, EUR C MACH LEARN