BioCaster: detecting public health rumors with a Web-based text mining system

Cited by: 139
Authors
Collier, Nigel [1 ,2 ]
Doan, Son [1 ]
Kawazoe, Ai [1 ]
Goodwin, Reiko Matsuda [1 ,3 ]
Conway, Mike [1 ]
Tateno, Yoshio [4 ]
Quoc-Hung Ngo
Dinh Dien
Kawtrakul, Asanee [5 ]
Takeuchi, Koichi [6 ]
Shigematsu, Mika [7 ]
Taniguchi, Kiyosu [7 ]
Affiliations
[1] ROIS, Natl Inst Informat, Tokyo 1018430, Japan
[2] Japan Sci & Technol Corp, PRESTO, Tokyo 1018430, Japan
[3] CUNY Herbert H Lehman Coll, Dept Anthropol, Bronx, NY 10468 USA
[4] ROIS, Natl Inst Genet, Mishima, Shizuoka 4118540, Japan
[5] Kasetsart Univ, Dept Comp Engn, NECTEC, Bangkok, Thailand
[6] Okayama Univ, Okayama 7008530, Japan
[7] Natl Inst Infect Dis, Tokyo 1628640, Japan
Funding
Japan Society for the Promotion of Science; Japan Science and Technology Agency
DOI
10.1093/bioinformatics/btn534
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
BioCaster is an ontology-based text mining system for detecting and tracking the distribution of infectious disease outbreaks from linguistic signals on the Web. The system continuously analyzes documents reported from over 1700 RSS feeds, classifies them for topical relevance and plots them onto a Google map using geocoded information. The background knowledge for bridging the gap between layman's terms and formal coding systems is contained in the freely available BioCaster ontology, which includes information in eight languages focused on the epidemiological role of pathogens as well as geographical locations with their latitudes/longitudes. The system consists of four main stages: topic classification, named entity recognition (NER), disease/location detection and event recognition. Higher order event analysis is used to detect more precisely specified warning signals, which can then be sent to registered users as email alerts. Evaluation of the system for topic recognition and entity identification is conducted on a gold standard corpus of annotated news articles.
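The four stages named in the abstract can be illustrated with a toy sketch. This is not BioCaster's implementation: the lexicons, the `Event` type and every function name below are hypothetical, and simple dictionary lookup stands in for the system's actual classifiers and ontology.

```python
from dataclasses import dataclass

# Hypothetical toy lexicons; the real BioCaster ontology is far richer.
DISEASES = {"cholera", "influenza", "dengue"}
GAZETTEER = {"tokyo": (35.68, 139.69), "bangkok": (13.75, 100.50)}
OUTBREAK_CUES = {"outbreak", "cases", "infected", "deaths"}


@dataclass
class Event:
    disease: str
    location: str
    lat: float
    lon: float


def tokenize(text):
    # Lowercase whitespace tokens with surrounding punctuation stripped.
    return [t.strip(".,;:!?()\"'").lower() for t in text.split()]


def is_relevant(tokens):
    # Stage 1, topic classification (assumed rule: disease term + outbreak cue).
    has_disease = any(t in DISEASES for t in tokens)
    has_cue = any(t in OUTBREAK_CUES for t in tokens)
    return has_disease and has_cue


def recognize_entities(tokens):
    # Stage 2, named entity recognition by plain dictionary lookup.
    entities = []
    for t in tokens:
        if t in DISEASES:
            entities.append((t, "DISEASE"))
        elif t in GAZETTEER:
            entities.append((t, "LOCATION"))
    return entities


def detect_disease_location(entities):
    # Stage 3, disease/location detection: take the first mention of each.
    disease = next((t for t, label in entities if label == "DISEASE"), None)
    location = next((t for t, label in entities if label == "LOCATION"), None)
    return disease, location


def recognize_event(text):
    # Stage 4, event recognition: emit a geocoded event or None.
    tokens = tokenize(text)
    if not is_relevant(tokens):
        return None
    disease, location = detect_disease_location(recognize_entities(tokens))
    if disease is None or location is None:
        return None
    lat, lon = GAZETTEER[location]
    return Event(disease, location, lat, lon)
```

Calling `recognize_event` on a news sentence returns an `Event` whose coordinates could be plotted on a map, or `None` when the text is off-topic or lacks a known location.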
Pages: 2940-2941
Page count: 2
Related papers
50 in total
  • [1] A web-based information system for public health
    Pereira, Octavio
    Luis, Tiago
    NOVAS PERSPECTIVAS EM SISTEMAS E TECNOLOGIAS DE INFORMACAO, VOL I, 2007, : 433 - 444
  • [2] A framework of web-based text mining on the grid
    Yu, L
    Wang, SY
    Lai, KK
    Wu, Y
    INTERNATIONAL CONFERENCE ON NEXT GENERATION WEB SERVICES PRACTICES, 2005, : 97 - 102
  • [3] Intrusion detection using Text Mining in a web-based telemedicine system
    Adeva, JJG
    Pikatza, JM
    Flórez, S
    Sobrado, FJ
    AI 2005: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2005, 3809 : 1009 - 1014
  • [4] The Voice of Chinese Health Consumers: A Text Mining Approach to Web-Based Physician Reviews
    Hao, Haijing
    Zhang, Kunpeng
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2016, 18 (05)
  • [5] Using Data Mining Techniques for Detecting Dependencies in the Outcoming Data of a Web-Based System
    Rak, Tomasz
    Zyla, Rafal
    APPLIED SCIENCES-BASEL, 2022, 12 (12):
  • [6] Identify group roles by text mining on group discussion in a web-based learning system
    Ou, KL
    Wang, CY
    Chen, GD
    Proceedings of 2005 International Conference on Machine Learning and Cybernetics, Vols 1-9, 2005, : 5566 - 5572
  • [7] Comparison of Web-Based Biosecurity Intelligence Systems: BioCaster, EpiSPIDER and HealthMap
    Lyon, A.
    Nunn, M.
    Grossel, G.
    Burgman, M.
    TRANSBOUNDARY AND EMERGING DISEASES, 2012, 59 (03) : 223 - 232
  • [8] PubTator: a web-based text mining tool for assisting biocuration
    Wei, Chih-Hsuan
    Kao, Hung-Yu
    Lu, Zhiyong
    NUCLEIC ACIDS RESEARCH, 2013, 41 (W1) : W518 - W522
  • [9] A Web-based Text Simplification System for English
    Ferres, Daniel
    Marimon, Montserrat
    Saggion, Horacio
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2015, (55): : 191 - 194
  • [10] BioTextQuest: a web-based biomedical text mining suite for concept discovery
    Papanikolaou, Nikolas
    Pafilis, Evangelos
    Nikolaou, Stavros
    Ouzounis, Christos A.
    Iliopoulos, Ioannis
    Promponas, Vasilis J.
    BIOINFORMATICS, 2011, 27 (23) : 3327 - 3328