Appraisal of machine learning techniques for predicting emerging disinfection byproducts in small water distribution networks

Cited by: 13
Authors
Hu, Guangji [1, 2, 4]
Mian, Haroon R. [2]
Mohammadiun, Saeed [2]
Rodriguez, Manuel J. [3]
Hewage, Kasun [2]
Sadiq, Rehan [2]
Affiliations
[1] Qingdao Univ, Sch Environm Sci & Engn, Qingdao 266071, Shandong, Peoples R China
[2] Univ British Columbia Okanagan, Sch Engn, 3333 Univ Way, Kelowna, BC V1V 1V7, Canada
[3] Bibliotheque Univ Laval, Ecole Super Amenagement Terr & Dev Reg ESAD, 2325, Quebec City, PQ G1V 0A6, Canada
[4] Qingdao Univ, Sch Environm Sci & Engn, 308 Ningxia Rd, Qingdao 266071, Shandong, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Emerging disinfection byproducts; Water quality modeling; Small water distribution networks; Support vector regression; Neural networks; DRINKING-WATER; DBPS; REGRESSION; REGION; MODELS;
DOI
10.1016/j.jhazmat.2022.130633
Chinese Library Classification (CLC) number
X [Environmental science, safety science];
Discipline classification code
08; 0830;
Abstract
Monitoring emerging disinfection byproducts (DBPs) is challenging for many small water distribution networks (SWDNs), and machine learning-based predictive modeling could be an alternative solution. In this study, eleven machine learning techniques (three multivariate linear regression-based, three regression tree-based, three neural network-based, and two advanced non-parametric regression techniques) were used to develop models for predicting three emerging DBPs (dichloroacetonitrile, chloropicrin, and trichloropropanone) in SWDNs. The model predictors include commonly measured water quality parameters and two conventional DBP groups. Sampling data for 141 cases were collected from eleven SWDNs in Canada; 70% of the cases were randomly selected for model training and the rest were used for validation. The modeling process was repeated 1000 times for each model. The results show that models developed using the advanced regression techniques, namely support vector regression and Gaussian process regression, exhibited the best prediction performance. Support vector regression models showed the highest prediction accuracy (R² = 0.94) and stability for predicting dichloroacetonitrile and trichloropropanone, whereas Gaussian process regression models were optimal for predicting chloropicrin (R² = 0.92). The difference is likely due to the much lower concentrations of chloropicrin compared with dichloroacetonitrile and trichloropropanone. Advanced non-parametric regression techniques, characterized by a probabilistic nature, were identified as the most suitable for developing the predictive models, followed by neural network-based (e.g., generalized regression neural network), regression tree-based (e.g., random forest), and multivariate linear regression-based techniques. This study identifies promising machine learning techniques among many commonly used alternatives for monitoring emerging DBPs in SWDNs under data constraints.
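The validation scheme described in the abstract (a random 70/30 split of the 141 cases, repeated 1000 times, scored by R²) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, a one-predictor least-squares line stands in for the support vector and Gaussian process regression models that the study actually evaluated, the rounding of the 70% split is a guess, and all function names are hypothetical. It is not the authors' implementation.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in model, one predictor)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(y_true, y_pred):
    """Coefficient of determination computed on held-out cases."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def repeated_holdout(xs, ys, train_frac=0.7, n_repeats=1000, seed=0):
    """Repeat a random train/validation split and collect validation R2."""
    rng = random.Random(seed)
    n = len(xs)
    n_train = round(train_frac * n)  # assumed rounding; the abstract does not say
    scores = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in test]
        scores.append(r_squared([ys[i] for i in test], preds))
    return scores


def make_synthetic_cases(n=141, seed=1):
    """Synthetic (predictor, response) pairs; NOT the study's sampling data."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 100.0) for _ in range(n)]
    ys = [0.05 * x + 1.0 + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_synthetic_cases()
    scores = repeated_holdout(xs, ys)
    # The mean summarizes accuracy; the spread across repeats indicates stability.
    print(f"mean R2 = {statistics.fmean(scores):.3f}, "
          f"sd = {statistics.stdev(scores):.3f}")
```

Reporting both the mean and the spread of R² across repeats is one way to read the abstract's paired claims about prediction accuracy and stability; with only 141 cases, a single split could give a misleadingly high or low score.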
Pages: 10
Related papers
50 in total
  • [1] Predicting few disinfection byproducts in the water distribution systems using machine learning models
    Shakhawat Chowdhury
    Karim Asif Sattar
    Syed Masiur Rahman
    Environmental Science and Pollution Research, 2025, 32 (7) : 3776 - 3794
  • [2] Machine learning framework for predicting cytotoxicity and identifying toxicity drivers of disinfection byproducts
    Sikder, Rabbi
    Zhang, Huichun
    Gao, Peng
    Ye, Tao
    JOURNAL OF HAZARDOUS MATERIALS, 2024, 469
  • [3] Predicting unregulated disinfection by-products in small water distribution networks: an empirical modelling framework
    Mian, Haroon R.
    Chhipi-Shrestha, Gyan
    Hewage, Kasun
    Rodriguez, Manuel J.
    Sadiq, Rehan
    ENVIRONMENTAL MONITORING AND ASSESSMENT, 2020, 192 (08)
  • [4] Predicting unregulated disinfection by-products in water distribution networks using generalized regression neural networks
    Mian, Haroon R.
    Hu, Guangji
    Hewage, Kasun
    Rodriguez, Manuel J.
    Sadiq, Rehan
    URBAN WATER JOURNAL, 2021, 18 (09) : 711 - 724
  • [5] Suspect and Nontarget Screening of Coexisting Emerging Contaminants and Aromatic Halogenated Disinfection Byproducts in Drinking Water Distribution Systems
    Gao, Quan
    Wang, Zhenyu
    Long, Wenqing
    Huang, Qiuyun
    Zhang, Jinna
    Zhang, Jin
    Hua, Pei
    Ying, Guang-Guo
    ACS ES&T WATER, 2024, 4 (08) : 3380 - 3390
  • [6] A group of emerging heterocyclic nitrogenous disinfection byproducts: Formation and cytotoxicity of halopyridinols in drinking water
    Wang, Leyi
    Zhong, Hongli
    Chen, Xueyao
    Chen, Xun
    Zhou, Qing
    Li, Aimin
    Pan, Yang
    JOURNAL OF HAZARDOUS MATERIALS, 2024, 472
  • [7] Cytotoxicity of emerging halophenylacetamide disinfection byproducts in drinking water: Mechanism and prediction
    Hu, Shaoyang
    Li, Xiangxiang
    He, Falin
    Qi, Yuntao
    Zhang, Beibei
    Liu, Rutao
    WATER RESEARCH, 2024, 256
  • [8] Optimization of disinfectant dosage for simultaneous control of lead and disinfection-byproducts in water distribution networks
    Maheshwari, Abhilasha
    Abokifa, Ahmed
    Gudi, Ravindra D.
    Biswas, Pratim
    JOURNAL OF ENVIRONMENTAL MANAGEMENT, 2020, 276
  • [9] Halonaphthoquinones: A group of emerging disinfection byproducts of high toxicity in drinking water
    Jiang, Hangcheng
    Kaw, Han Yeong
    Zhu, Lizhong
    Wang, Wei
    WATER RESEARCH, 2022, 217
  • [10] Emerging nitrogenous disinfection byproducts: Transformation of the antidiabetic drug metformin during chlorine disinfection of water
    Armbruster, Dominic
    Happel, Oliver
    Scheurer, Marco
    Harms, Klaus
    Schmidt, Torsten C.
    Brauch, Heinz-Juergen
    WATER RESEARCH, 2015, 79 : 104 - 118