Appraisal of machine learning techniques for predicting emerging disinfection byproducts in small water distribution networks

Cited by: 13
Authors
Hu, Guangji [1, 2, 4]
Mian, Haroon R. [2 ]
Mohammadiun, Saeed [2 ]
Rodriguez, Manuel J. [3 ]
Hewage, Kasun [2 ]
Sadiq, Rehan [2 ]
Affiliations
[1] Qingdao Univ, Sch Environm Sci & Engn, Qingdao 266071, Shandong, Peoples R China
[2] Univ British Columbia Okanagan, Sch Engn, 3333 Univ Way, Kelowna, BC V1V 1V7, Canada
[3] Bibliotheque Univ Laval, Ecole Super Amenagement Terr & Dev Reg ESAD, 2325, Quebec City, PQ G1V 0A6, Canada
[4] Qingdao Univ, Sch Environm Sci & Engn, 308 Ningxia Rd, Qingdao 266071, Shandong, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Emerging disinfection byproducts; Water quality modeling; Small water distribution networks; Support vector regression; Neural networks; DRINKING-WATER; DBPS; REGRESSION; REGION; MODELS;
DOI
10.1016/j.jhazmat.2022.130633
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Monitoring emerging disinfection byproducts (DBPs) is challenging for many small water distribution networks (SWDNs), and machine learning-based predictive modeling could offer an alternative. In this study, eleven machine learning techniques (three multivariate linear regression-based, three regression tree-based, three neural network-based, and two advanced non-parametric regression techniques) were used to develop models for predicting three emerging DBPs (dichloroacetonitrile, chloropicrin, and trichloropropanone) in SWDNs. Model predictors include commonly measured water quality parameters and two conventional DBP groups. Sampling data comprising 141 cases were collected from eleven SWDNs in Canada; 70% of the cases were randomly selected for model training and the remainder were used for validation. The modeling process was repeated 1000 times for each model. The results show that models developed using the advanced regression techniques, namely support vector regression and Gaussian process regression, exhibited the best prediction performance. Support vector regression models showed the highest prediction accuracy (R² = 0.94) and stability for predicting dichloroacetonitrile and trichloropropanone, whereas Gaussian process regression models were optimal for predicting chloropicrin (R² = 0.92). The difference is likely due to the much lower concentrations of chloropicrin compared with dichloroacetonitrile and trichloropropanone. Advanced non-parametric regression techniques, characterized by a probabilistic nature, were identified as the most suitable for developing the predictive models, followed by neural network-based (e.g., generalized regression neural network), regression tree-based (e.g., random forest), and multivariate linear regression-based techniques. This study identifies promising machine learning techniques among many commonly used alternatives for monitoring emerging DBPs in SWDNs under data constraints.
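The validation protocol described in the abstract (a random 70/30 split, repeated 1000 times per model) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the data are synthetic, the single-predictor least-squares line is a simple stand-in for the eleven techniques compared in the study, and the function names (`repeated_holdout`, `fit_line`, `make_synthetic`) are hypothetical rather than taken from the paper.

```python
import random
import statistics


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def repeated_holdout(xs, ys, train_frac=0.7, n_repeats=1000, seed=0):
    """Repeat a random train/validation split and collect validation R^2."""
    rng = random.Random(seed)
    n = len(xs)
    n_train = round(train_frac * n)
    scores = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random split on every repeat
        train, valid = idx[:n_train], idx[n_train:]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in valid]
        scores.append(r_squared([ys[i] for i in valid], preds))
    return scores


def make_synthetic(n=141, seed=1):
    """Synthetic stand-in data (NOT the study's measurements)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1.0) for x in xs]
    return xs, ys


xs, ys = make_synthetic()
scores = repeated_holdout(xs, ys)
print(f"mean R2 = {statistics.fmean(scores):.3f}, "
      f"sd = {statistics.stdev(scores):.3f}")
```

The mean of the collected scores plays the role of the reported accuracy, and their spread gives one plausible reading of what the abstract calls stability; how the study actually quantified stability is not stated in this record.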
Pages: 10