Prediction of Pseudomonas aeruginosa abundance in drinking water distribution systems using machine learning

Authors
Zhou, Qiaomei [1]
Li, Yukang [2]
Wang, Min [2]
Huang, Jingang [1,3]
Li, Weishuai [1]
Qiu, Shanshan [1]
Wang, Haibo [2]
Affiliations
[1] Hangzhou Dianzi Univ, Coll Mat & Environm Engn, Hangzhou 310018, Peoples R China
[2] Chinese Acad Sci, Res Ctr Ecoenvironm Sci, Key Lab Drinking Water Sci & Technol, Beijing 100085, Peoples R China
[3] Hangzhou Dianzi Univ, China Austria Belt & Rd Joint Lab Artificial Intel, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine learning; Pseudomonas aeruginosa; Drinking water; Feature selection; Model validation; OPTIMIZATION; SELECTION;
DOI
10.1016/j.psep.2024.11.099
Chinese Library Classification
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Detecting Pseudomonas aeruginosa is a challenging but crucial task for ensuring the biosafety of drinking water. Current cultivation and molecular qPCR methods are costly, laborious, and time-consuming, leading to inaccuracies and delayed monitoring. In this study, three machine learning (ML) models, eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and Support Vector Regression (SVR), were developed, interpreted, and validated for their ability to predict P. aeruginosa abundance in both urban and rural drinking water distribution systems (DWDS). To ensure the reliability and robustness of the ML models, data leakage management during data pre-processing, 5-fold cross-validation, and grid search for hyperparameter tuning were used during the training phase. To control overfitting, feature selection using an embedded method was implemented to exclude three low-contributing input variables: oxidation-reduction potential (ORP), total chlorine, and heterotrophic plate counts (HPC). The XGBoost model outperformed the RF and SVR models in accuracy and generalizability when predicting P. aeruginosa abundance, achieving training/testing R² of 0.92/0.85 in the urban system and 0.94/0.87 in the rural system. Feature importance analysis revealed that water temperature, dissolved oxygen (DO), residual chlorine, and NO₃⁻-N were the key variables for the prediction. Validation experiments, based on random sampling from both urban and rural DWDS, showed acceptable relative errors of 10.77% and 8.86%, respectively. Overall, this study provides an applicable ML modeling framework for accurate and fast prediction of P. aeruginosa abundance in DWDS, potentially reducing laborious experiments in the future.
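The training workflow named in the abstract (pre-processing kept inside the cross-validation loop, embedded feature selection, 5-fold cross-validation with grid search, and training/testing R²) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the feature names and hyperparameter grid are hypothetical, and scikit-learn's RandomForestRegressor stands in for XGBoost so the sketch needs only scikit-learn and NumPy. The `relative_error` helper assumes the usual definition |predicted - measured| / |measured|, which the abstract does not spell out.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical column names; the real input variables come from the paper's dataset.
FEATURES = ["temperature", "dissolved_oxygen", "residual_chlorine",
            "nitrate_n", "orp", "total_chlorine", "hpc"]


def make_synthetic_data(n_samples=300, seed=0):
    """Synthetic stand-in data: the first four columns drive the target, the rest are noise."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, len(FEATURES)))
    y = (1.5 * X[:, 0] - 1.0 * X[:, 1] - 0.8 * X[:, 2] + 0.5 * X[:, 3]
         + rng.normal(scale=0.3, size=n_samples))
    return X, y


def fit_and_evaluate(X, y, seed=0):
    """Tune a pipeline with 5-fold CV and grid search, then report train/test R^2."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed)

    # Scaling and feature selection live inside the pipeline, so each CV fold
    # fits them on its own training split only (one way to avoid data leakage).
    # Scaling is not needed by tree models; it is kept to show the pattern.
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        # Embedded selection: keep features whose importance is >= the median.
        ("select", SelectFromModel(
            RandomForestRegressor(n_estimators=100, random_state=seed),
            threshold="median")),
        ("model", RandomForestRegressor(random_state=seed)),
    ])

    # Hypothetical grid; the paper's actual search space is not given in the abstract.
    param_grid = {"model__n_estimators": [100, 200],
                  "model__max_depth": [None, 5]}
    search = GridSearchCV(
        pipeline, param_grid, scoring="r2",
        cv=KFold(n_splits=5, shuffle=True, random_state=seed))
    search.fit(X_train, y_train)

    train_r2 = r2_score(y_train, search.predict(X_train))
    test_r2 = r2_score(y_test, search.predict(X_test))
    kept = search.best_estimator_.named_steps["select"].get_support()
    return train_r2, test_r2, kept


def relative_error(predicted, measured):
    """Relative error in percent, assuming |predicted - measured| / |measured|."""
    return abs(predicted - measured) / abs(measured) * 100.0


if __name__ == "__main__":
    X, y = make_synthetic_data()
    train_r2, test_r2, kept = fit_and_evaluate(X, y)
    print("train R2:", round(train_r2, 2), "test R2:", round(test_r2, 2))
    print("kept features:", [f for f, k in zip(FEATURES, kept) if k])
```

Putting the scaler and selector inside the pipeline is the design choice that matters here: if they were fitted on the full dataset before splitting, information from the held-out folds would leak into training and inflate the cross-validated scores.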
Pages: 1050-1060
Page count: 11