Prediction of Pseudomonas aeruginosa abundance in drinking water distribution systems using machine learning

Cited by: 0
Authors
Zhou, Qiaomei [1]
Li, Yukang [2]
Wang, Min [2]
Huang, Jingang [1, 3]
Li, Weishuai [1]
Qiu, Shanshan [1]
Wang, Haibo [2]
Affiliations
[1] Hangzhou Dianzi Univ, Coll Mat & Environm Engn, Hangzhou 310018, Peoples R China
[2] Chinese Acad Sci, Res Ctr Ecoenvironm Sci, Key Lab Drinking Water Sci & Technol, Beijing 100085, Peoples R China
[3] Hangzhou Dianzi Univ, China Austria Belt & Rd Joint Lab Artificial Intel, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine learning; Pseudomonas aeruginosa; Drinking water; Feature selection; Model validation; OPTIMIZATION; SELECTION;
DOI
10.1016/j.psep.2024.11.099
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Detecting Pseudomonas aeruginosa is a challenging but crucial task for ensuring the biosafety of drinking water. Current cultivation and molecular qPCR methods are costly, laborious, and time-consuming, leading to inaccuracies and delayed monitoring. In this study, three machine learning (ML) models, eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and Support Vector Regression (SVR), were developed, interpreted, and validated for their ability to predict P. aeruginosa abundance in both urban and rural drinking water distribution systems (DWDS). To ensure the reliability and robustness of the ML models, data leakage management during data pre-processing, 5-fold cross-validation, and grid search for hyperparameter tuning were used during the training phase. To control overfitting, feature selection using an embedded method was implemented to exclude three low-contributing input variables: oxidation-reduction potential (ORP), total chlorine, and heterotrophic plate counts (HPC). The XGBoost model outperformed the RF and SVR models in accuracy and generalizability when predicting P. aeruginosa abundance, achieving training/testing R² of 0.92/0.85 in the urban system and 0.94/0.87 in the rural system. Feature importance analysis revealed that water temperature, dissolved oxygen (DO), residual chlorine, and NO₃⁻-N were key variables for the prediction. Validation experiments on samples randomly collected from the urban and rural DWDS showed acceptable relative errors of 10.77% and 8.86%, respectively. Overall, this study provides an applicable ML modeling framework for the accurate and fast prediction of P. aeruginosa abundance in DWDS, potentially reducing laborious experiments in the future.
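The training procedure named in the abstract (leakage-aware pre-processing, 5-fold cross-validation, and grid search) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: a k-nearest-neighbour regressor stands in for XGBoost, RF, and SVR, the data are synthetic, and every function name and coefficient below is a hypothetical choice made for illustration. The point it shows is that the scaling statistics are fitted on the training folds only, so the held-out fold never influences pre-processing.

```python
import random
import statistics

def r2_score(y_true, y_pred):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_scaler(rows):
    """Per-feature mean and standard deviation (constant columns get std 1)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return means, stds

def transform(rows, scaler):
    """Standardize rows with statistics learned elsewhere."""
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def knn_predict(train_X, train_y, x, k):
    """Placeholder model (assumption): mean target of the k nearest points."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return statistics.fmean(train_y[i] for i in order[:k])

def cross_validate(X, y, n_folds=5, k=3, seed=0):
    """K-fold CV in which the scaler is fitted on the training folds only."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in idx if i not in held]
        scaler = fit_scaler([X[i] for i in train_idx])  # no test data here
        train_X = transform([X[i] for i in train_idx], scaler)
        train_y = [y[i] for i in train_idx]
        test_X = transform([X[i] for i in held_out], scaler)
        preds = [knn_predict(train_X, train_y, x, k) for x in test_X]
        scores.append(r2_score([y[i] for i in held_out], preds))
    return scores

def grid_search(X, y, k_values=(1, 3, 5, 7)):
    """Pick the hyperparameter with the best mean cross-validated R^2."""
    results = {k: statistics.fmean(cross_validate(X, y, k=k)) for k in k_values}
    return max(results, key=results.get), results

def make_synthetic(n=100, seed=1):
    """Synthetic stand-in data: temperature, DO, residual chlorine (made up)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        temp = rng.uniform(5, 30)
        do = rng.uniform(4, 10)
        cl = rng.uniform(0.05, 1.0)
        X.append([temp, do, cl])
        y.append(0.1 * temp - 0.2 * do - 1.5 * cl + rng.gauss(0, 0.1))
    return X, y

X, y = make_synthetic()
best_k, results = grid_search(X, y)
fold_scores = cross_validate(X, y, k=best_k)
print("best k:", best_k, "mean CV R^2:", statistics.fmean(fold_scores))
```

Fitting the scaler inside the fold loop, rather than once on the full data set, is the design choice that keeps the cross-validated R² an honest estimate of performance on unseen samples.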
Pages: 1050-1060
Page count: 11
Related papers
50 in total
  • [21] Disruption of Pseudomonas aeruginosa Adherent Cells by NaCl and NaOCl in Drinking Water
    Elgoulli, Mourad
    Zahir, Hafida
    Ellouali, Mostafa
    Latrache, Hassan
    CURRENT MICROBIOLOGY, 2023, 80 (05)
  • [22] Integration of Pseudomonas aeruginosa and Legionella pneumophila in drinking water biofilms grown on domestic plumbing materials
    Moritz, Miriam M.
    Flemming, Hans-Curt
    Wingender, Jost
    INTERNATIONAL JOURNAL OF HYGIENE AND ENVIRONMENTAL HEALTH, 2010, 213 (03) : 190 - 197
  • [23] Applications of machine learning in drinking water quality management: A critical review on water distribution system
    Li, Zhaopeng
    Ma, Wencheng
    Zhong, Dan
    Ma, Jun
    Zhang, Qingzhou
    Yuan, Yongqin
    Liu, Xiaofei
    Wang, Xiaodong
    Zou, Kangbing
    JOURNAL OF CLEANER PRODUCTION, 2024, 481
  • [24] Predicting few disinfection byproducts in the water distribution systems using machine learning models
    Chowdhury, Shakhawat
    Sattar, Karim Asif
    Rahman, Syed Masiur
    ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH, 2025, 32 (7) : 3776 - 3794
  • [25] Mechanisms of survival mediated by the stringent response in Pseudomonas aeruginosa under environmental stress in drinking water systems: Nitrogen deficiency and bacterial competition
    Wang, Xu
    Wang, Jing
    Liu, Shao-Yang
    Guo, Jin-Song
    Fang, Fang
    Chen, You-Peng
    Yan, Peng
    JOURNAL OF HAZARDOUS MATERIALS, 2023, 448
  • [26] Forecasting bacteriological presence in treated drinking water using machine learning
    Kyritsakas, Grigorios
    Boxall, Joby
    Speight, Vanessa
    FRONTIERS IN WATER, 2023, 5
  • [27] Predicting antimicrobial resistance in Pseudomonas aeruginosa with machine learning-enabled molecular diagnostics
    Khaledi, Ariane
    Weimann, Aaron
    Schniederjans, Monika
    Asgari, Ehsaneddin
    Kuo, Tzu-Hao
    Oliver, Antonio
    Cabot, Gabriel
    Kola, Axel
    Gastmeier, Petra
    Hogardt, Michael
    Jonas, Daniel
    Mofrad, Mohammad R. K.
    Bremges, Andreas
    McHardy, Alice C.
    Haeussler, Susanne
    EMBO MOLECULAR MEDICINE, 2020, 12 (03)
  • [28] Prediction of drinking water quality with machine learning models: A public health nursing approach
    Ozsezer, Gozde
    Mermer, Gulengul
    PUBLIC HEALTH NURSING, 2024, 41 (01) : 175 - 191
  • [29] Cost prediction for water reuse equipment using interpretable machine learning models
    Chen, Kan
    Zhang, Yuezheng
    Hu, Naixin
    Ye, Chao
    Ma, Ji
    Zheng, Tong
    JOURNAL OF WATER PROCESS ENGINEERING, 2024, 63
  • [30] A Machine Learning Approach to College Drinking Prediction and Risk Factor Identification
    Bi, Jinbo
    Sun, Jiangwen
    Wu, Yu
    Tennen, Howard
    Armeli, Stephen
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2013, 4 (04)