Leptospirosis modelling using hydrometeorological indices and random forest machine learning

Authors
Veianthan Jayaramu
Zed Zulkafli
Simon De Stercke
Wouter Buytaert
Fariq Rahmat
Ribhan Zafira Abdul Rahman
Asnor Juraiza Ishak
Wardah Tahir
Jamalludin Ab Rahman
Nik Mohd Hafiz Mohd Fuzi
Affiliations
[1] Universiti Putra Malaysia, Department of Civil Engineering
[2] Imperial College London, Department of Civil and Environmental Engineering
[3] Universiti Putra Malaysia, Department of Electrical and Electronic Engineering
[4] Universiti Teknologi MARA, Flood Control Research Group, Faculty of Civil Engineering
[5] International Islamic University Malaysia, Department of Community Medicine, Kulliyyah of Medicine
[6] Ministry of Health Malaysia, Kelantan State Health Department
Keywords
Leptospirosis; Hydrometeorological indices; Cross-correlation analysis; Random forest; Variable importance; Feature selection
Abstract
Leptospirosis is a zoonosis that has been linked to hydrometeorological variability. Hydrometeorological averages and extremes have previously been used as drivers in statistical prediction of the disease, but their relative importance and predictive capacity remain poorly understood. This study used a random forest classifier to analyze the relative importance of hydrometeorological indices in leptospirosis modelling and to evaluate model performance by the type of indices used, with case data from three districts in Kelantan, Malaysia, that experience annual monsoonal rainfall and flooding. First, hydrometeorological data (rainfall, streamflow, water level, relative humidity, and temperature) were transformed into 164 weekly average and extreme indices in accordance with the Expert Team on Climate Change Detection and Indices (ETCCDI). Then, weekly case occurrences were classified into binary classes, “high” and “low,” based on an average threshold. Seventeen models based on “average,” “extreme,” and “mixed” indices were trained, with feature subsets optimized using the model-computed mean decrease Gini (MDG) scores. Variable importance was assessed through cross-correlation analysis and the MDG score. The average and extreme models showed similar prediction accuracy ranges (61.5–76.1% and 72.3–77.0%, respectively), while the mixed models showed an improvement (71.7–82.6%). An extreme model was the most sensitive, while an average model was the most specific. The time lags associated with the driving indices agreed with the seasonality of the monsoon. Rainfall (extreme) was the most important variable in classifying leptospirosis occurrence, while streamflow was the least important despite showing higher correlations with leptospirosis.
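Two of the steps named in the abstract, labelling weekly cases against an average threshold and checking time lags by cross-correlation, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the strict "greater than the mean" rule, the use of Pearson correlation, and the synthetic rainfall and case series are all assumptions made for illustration, and the random forest and MDG ranking steps are not reproduced here.

```python
from statistics import mean


def label_weeks(cases):
    """Label a week 'high' if its case count exceeds the series mean, else 'low'.

    Assumption: the 'average threshold' is the mean weekly count, applied strictly.
    """
    threshold = mean(cases)
    return ["high" if c > threshold else "low" for c in cases]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def lagged_correlation(index, cases, max_lag):
    """Correlate the index at week t - lag with cases at week t, for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = index[: len(index) - lag]  # index values, shifted back by `lag` weeks
        y = cases[lag:]                # case counts `lag` weeks later
        result[lag] = pearson(x, y)
    return result


# Synthetic weekly series (hypothetical values, not data from the study).
rain = [0, 5, 40, 10, 0, 0, 35, 5, 0, 0]
cases = [1, 1, 1, 2, 9, 3, 1, 1, 8, 2]

labels = label_weeks(cases)
corr = lagged_correlation(rain, cases, max_lag=3)
best_lag = max(corr, key=corr.get)  # lag (in weeks) with the highest correlation
```

In this synthetic series the case peaks trail the rainfall peaks by two weeks, so `best_lag` comes out as 2. On real data the lag structure would need to be examined per index and per district.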
Pages: 423–437
Page count: 14
Source: International Journal of Biometeorology, 2023, 67 (03): 423–437