Appraisal of machine learning techniques for predicting emerging disinfection byproducts in small water distribution networks

Cited by: 13
Authors
Hu, Guangji [1, 2, 4]
Mian, Haroon R. [2]
Mohammadiun, Saeed [2]
Rodriguez, Manuel J. [3]
Hewage, Kasun [2]
Sadiq, Rehan [2]
Affiliations
[1] Qingdao Univ, Sch Environm Sci & Engn, Qingdao 266071, Shandong, Peoples R China
[2] Univ British Columbia Okanagan, Sch Engn, 3333 Univ Way, Kelowna, BC V1V 1V7, Canada
[3] Bibliotheque Univ Laval, Ecole Super Amenagement Terr & Dev Reg ESAD, 2325, Quebec City, PQ G1V 0A6, Canada
[4] Qingdao Univ, Sch Environm Sci & Engn, 308 Ningxia Rd, Qingdao 266071, Shandong, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Emerging disinfection byproducts; Water quality modeling; Small water distribution networks; Support vector regression; Neural networks; DRINKING-WATER; DBPS; REGRESSION; REGION; MODELS
DOI
10.1016/j.jhazmat.2022.130633
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Monitoring emerging disinfection byproducts (DBPs) is challenging for many small water distribution networks (SWDNs), and machine learning-based predictive modeling could be an alternative solution. In this study, eleven machine learning techniques, including three multivariate linear regression-based, three regression tree-based, three neural network-based, and two advanced non-parametric regression techniques, were used to develop models for predicting three emerging DBPs (dichloroacetonitrile, chloropicrin, and trichloropropanone) in SWDNs. Predictors of the models include commonly measured water quality parameters and two conventional DBP groups. Sampling data of 141 cases were collected from eleven SWDNs in Canada, of which 70% were randomly selected for model training and the remainder were used for validation. The modeling process was repeated 1000 times for each model. The results show that models developed using advanced regression techniques, including support vector regression and Gaussian process regression, exhibited the best prediction performance. Support vector regression models showed the highest prediction accuracy (R² = 0.94) and stability for predicting dichloroacetonitrile and trichloropropanone, whereas Gaussian process regression models were optimal for predicting chloropicrin (R² = 0.92). The difference is likely due to the much lower concentrations of chloropicrin than of dichloroacetonitrile and trichloropropanone. Advanced non-parametric regression techniques, characterized by a probabilistic nature, were identified as the most suitable for developing the predictive models, followed by neural network-based (e.g., generalized regression neural network), regression tree-based (e.g., random forest), and multivariate linear regression-based techniques. This study identifies promising machine learning techniques among many commonly used alternatives for monitoring emerging DBPs in SWDNs under data constraints.
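The validation protocol described in the abstract (a random 70/30 split of 141 cases, repeated 1000 times) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the data are synthetic, the model is a single-predictor least-squares line standing in for the study's techniques (it is not support vector regression or Gaussian process regression), the training size is taken as the integer part of 0.7 × 141, and the function names `fit_line`, `r_squared`, and `repeated_holdout` are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one predictor (stand-in model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(y_true, y_pred):
    """Coefficient of determination computed on the held-out cases."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def repeated_holdout(xs, ys, train_frac=0.7, n_repeats=1000, seed=0):
    """Repeat a random train/validation split and collect validation R^2 values."""
    rng = random.Random(seed)
    n = len(xs)
    n_train = int(train_frac * n)  # assumption: truncate 0.7 * n to an integer
    scores = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in test]
        scores.append(r_squared([ys[i] for i in test], preds))
    return scores


# Synthetic data only: 141 cases with a made-up linear relationship plus noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(141)]
ys = [2.0 + 0.5 * x + data_rng.gauss(0.0, 0.3) for x in xs]

scores = repeated_holdout(xs, ys)
print(f"mean validation R^2: {statistics.fmean(scores):.3f}")
print(f"std of validation R^2: {statistics.stdev(scores):.3f}")
```

The spread of the 1000 validation scores is one simple way to read the "stability" mentioned in the abstract: a technique whose R² varies little across random splits is less sensitive to which cases happen to fall in the training set.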
Pages: 10