Comprehensive assessment of E. coli dynamics in river water using advanced machine learning and explainable AI

Authors
Mallik, Santanu [1, 2]
Saha, Bodhipriya [2 ]
Podder, Krishanu [3 ]
Muthuraj, Muthusivaramapandian [4 ]
Mishra, Umesh [2 ]
Deb, Sharbari [5 ]
Affiliations
[1] Poornima Coll Engn, Dept Civil Engn, Jaipur 302022, Rajasthan, India
[2] Natl Inst Technol Agartala, Dept Civil Engn, Jirania 799046, Tripura, India
[3] Govt Tripura, Dept Elementary Educ, Agartala, India
[4] Natl Inst Technol Agartala, Dept Bioengn, Jirania 799046, Tripura, India
[5] Poornima Univ, Dept Elect & Comp Engn, Jaipur 303905, Rajasthan, India
Keywords
E. coli; Land use; QMRA; Automatic machine learning algorithm; Explainable artificial intelligence; RISK-ASSESSMENT; LAND-USE; QUALITY
DOI
10.1016/j.psep.2025.106816
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
The discharge of untreated municipal wastewater has caused faecal contamination of river water, posing severe public health risks and compromising safe irrigation. The present study therefore quantified Escherichia coli (E. coli) contamination in three rivers of the Tripura region and assessed the impact of land use (LU) patterns on E. coli dynamics using spatial distribution maps. Further, a Quantitative Microbial Risk Assessment (QMRA) model was used to evaluate the microbial risks to farmers who irrigate with contaminated river water. Finally, this study is the first of its kind to use and compare three hyperparameter-tuning frameworks (Bayesian optimization, the Tree-based Pipeline Optimization Tool, and Optuna) for predicting E. coli concentration. The work also applies two Explainable AI (XAI) methods, Shapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), for global and local site-specific sensitivity analyses, providing interpretable and actionable insights. The findings show that water in all three rivers is unsuitable for drinking, primarily because of elevated E. coli levels. Stable pH levels and favorable temperatures support E. coli growth, intensifying the contamination risk. The QMRA model further indicates a probability of 0.01-0.57 of significant health risks for farmers using contaminated water. Additionally, the machine learning approaches, together with statistical metrics and cumulative density function plots, show that the Optuna-optimized extreme gradient-boosting (XGBoost) model outperforms the random forest (RF) and gradient-boosting machine (GBM) models. The XAI analyses identified electrical conductivity and total dissolved solids as the most influential factors affecting E. coli concentrations. Overall, this framework can predict regions impacted by faecal contamination, supporting the sustainable development goals for clean water and health.
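The abstract does not say which dose-response model the QMRA step uses. As a rough illustration only, the sketch below assumes the approximate beta-Poisson model that is common in QMRA work, followed by the usual conversion from per-exposure risk to annual risk. The function names, the parameter values (`alpha`, `n50`), the ingestion volume, and the exposure count are all hypothetical placeholders, not values taken from the study.

```python
def irrigation_dose(conc_cfu_per_100ml, ingested_ml):
    """Organisms ingested in one exposure, given a concentration in CFU/100 mL."""
    return conc_cfu_per_100ml / 100.0 * ingested_ml


def beta_poisson_risk(dose, alpha, n50):
    """Approximate beta-Poisson probability of infection for a single exposure.

    alpha is the shape parameter and n50 the median infective dose;
    both are assumptions supplied by the caller.
    """
    return 1.0 - (1.0 + dose * (2.0 ** (1.0 / alpha) - 1.0) / n50) ** (-alpha)


def annual_risk(p_single, exposures_per_year):
    """Probability of at least one infection over independent exposures."""
    return 1.0 - (1.0 - p_single) ** exposures_per_year


if __name__ == "__main__":
    # Placeholder inputs for illustration only (not from the paper).
    alpha, n50 = 0.2, 1.0e7
    dose = irrigation_dose(conc_cfu_per_100ml=5000.0, ingested_ml=1.0)
    p = beta_poisson_risk(dose, alpha, n50)
    print(f"per-exposure risk: {p:.3e}")
    print(f"annual risk (100 exposures): {annual_risk(p, 100):.3e}")
```

By construction, a dose equal to `n50` gives a per-exposure risk of 0.5, which is a quick sanity check on any parameter set plugged into the sketch.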
Pages: 12
Related papers
50 in total
  • [1] Predicting the presence of E. coli in tap water using machine learning in Nepal
    Kuroki, So
    Ogata, Ryuji
    Sakamoto, Maiko
    WATER AND ENVIRONMENT JOURNAL, 2023, 37 (03) : 402 - 411
  • [2] Prediction of E. coli Concentrations in Agricultural Pond Waters: Application and Comparison of Machine Learning Algorithms
    Stocker, Matthew D.
    Pachepsky, Yakov A.
    Hill, Robert L.
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2022, 4
  • [3] Machine learning and explainable AI for chlorophyll-a prediction in Namhan River Watershed, South Korea
    Han, Ji Woo
    Kim, TaeHo
    Lee, Sangchul
    Kang, Taegu
    Im, Jong Kwon
    ECOLOGICAL INDICATORS, 2024, 166
  • [4] Prediction of Boundary and Stormwater E. Coli Concentrations Using River Flows and Baseflow Index
    Jagupilla, Sarath Chandra K.
    Shah, Vishwa
    Ramaswamy, Venkatsundar
    Gurumurthy, Praneeth
    Vaccari, David A.
    JOURNAL OF ENVIRONMENTAL ENGINEERING, 2020, 146 (04)
  • [5] Interpretability Versus Accuracy: A Comparison of Machine Learning Models Built Using Different Algorithms, Performance Measures, and Features to Predict E. coli Levels in Agricultural Water
    Weller, Daniel L.
    Love, Tanzy M. T.
    Wiedmann, Martin
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2021, 4
  • [6] Reliable Autism Spectrum Disorder Diagnosis for Pediatrics Using Machine Learning and Explainable AI
    Jeon, Insu
    Kim, Minjoong
    So, Dayeong
    Kim, Eun Young
    Nam, Yunyoung
    Kim, Seungsoo
    Shim, Sehoon
    Kim, Joungmin
    Moon, Jihoon
    DIAGNOSTICS, 2024, 14 (22)
  • [7] Temporal Dynamics and Predictive Modelling of Streamflow and Water Quality Using Advanced Statistical and Ensemble Machine Learning Techniques
    Farzana, Syeda Zehan
    Paudyal, Dev Raj
    Chadalavada, Sreeni
    Alam, Md Jahangir
    WATER, 2024, 16 (15)
  • [8] DISINFECTION OF E. coli CONTAMINATED WATER USING TUNGSTEN TRIOXIDE-BASED PHOTOELECTROCATALYSIS
    Scott-Emuakpor, Efetobor
    Paton, Graeme I.
    Todd, Malcolm J.
    Macphee, Donald E.
    ENVIRONMENTAL ENGINEERING AND MANAGEMENT JOURNAL, 2016, 15 (04) : 899 - 903
  • [9] Explainable AI-driven machine learning for heart disease detection using ECG signal
    Majhi, Babita
    Kashyap, Aarti
    APPLIED SOFT COMPUTING, 2024, 167
  • [10] Fast identification and susceptibility determination of E. coli isolated directly from patients' urine using infrared-spectroscopy and machine learning
    Abu-Aqil, George
    Suleiman, Manal
    Sharaha, Uraib
    Riesenberg, Klaris
    Lapidot, Itshak
    Huleihel, Mahmoud
    Salman, Ahmad
    SPECTROCHIMICA ACTA PART A-MOLECULAR AND BIOMOLECULAR SPECTROSCOPY, 2023, 285