Knowledge discovery in open data for epidemic disease prediction

Cited by: 4
Authors
Wu, ChienHsing [1]
Kao, Shu-Chen [2]
Affiliations
[1] Natl Univ Kaohsiung, Dept Informat Management, 700 Kaohsiung Univ Rd, Kaohsiung 81148, Taiwan
[2] Kun Shan Univ, Dept Informat Management, 195 Kunda Rd, Tainan, Taiwan
Keywords
Open data; Knowledge extraction; Dengue; Influenza; Enterovirus; Google trends; Indexing; Knowledge discovery; Epidemic diseases; Health care; DENGUE-FEVER; SPATIOTEMPORAL PATTERNS; DECISION TREE; MODEL; IDENTIFICATION; ASSOCIATION; INFLUENZA; CLIMATE; IMPACT; HUMIDITY;
DOI
10.1016/j.hlpt.2021.01.001
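Chinese Library Classification number
R19 [Health care organizations and services (health services administration)];
Discipline classification code
Abstract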
Objective: This research identifies the determinants associated with three epidemic diseases (dengue, influenza, and enterovirus) in Taiwan. It demonstrates the value of open data in developing prediction models that support public health policymaking. Method: A knowledge discovery technique was employed to extract determinants from open data on epidemic diseases. The open dataset, collected and integrated from the Taiwan Centers for Disease Control, the Central Weather Bureau, and Google Trends, includes 70,915 dengue, 34,062 enterovirus, and 52,908 influenza cases. A prediction model using a classification-oriented extraction mechanism was applied to the open epidemic data, climate data, and Google Trends data. Prediction models that included Google Trends data were compared with models that did not. Prediction accuracy and simplicity of the decision rules are reported. Results: Prediction accuracy and rule simplicity for the three diseases are acceptable when Google Trends data are excluded and differ slightly when Google Trends data are included. Location (county) is the main predictor for all three epidemic diseases. Time (month) is the second-strongest determinant for dengue, and age is a notable determinant for enterovirus and influenza. Mean temperature exhibits the highest entropy for dengue, time for enterovirus, and humidity for influenza. Conclusions: The number of confirmed cases for any of the three epidemic diseases cannot be predicted by a single variable. Knowledge extraction using the classification-oriented technique can be successfully applied in prediction model development. Google Trends data play a notable but inconsistent role in predicting the three epidemic diseases with respect to prediction accuracy and simplicity of the generated decision tree. © 2021 Fellowship of Postgraduate Medicine. Published by Elsevier Ltd. All rights reserved.
Pages: 126-134
Number of pages: 9