Predicting few disinfection byproducts in the water distribution systems using machine learning models

Cited by: 0
Authors
Shakhawat Chowdhury [1]
Karim Asif Sattar [4]
Syed Masiur Rahman [2]
Affiliations
[1] Department of Civil and Environmental Engineering, King Fahd University of Petroleum & Minerals, Dhahran
[2] Research Engineer I, Interdisciplinary Research Center for Smart Mobility and Logistics, King Fahd University of Petroleum & Minerals, Dhahran
[3] Research Engineer I, Applied Research Center for Environment & Marine Studies, Research Institute, King Fahd University of Petroleum & Minerals, Dhahran
[4] IRC CBM, King Fahd University of Petroleum & Minerals, Dhahran
Keywords
Disinfection byproducts; Drinking water; Machine learning models; Model training and testing; Risk reduction; Water distribution system;
DOI
10.1007/s11356-025-35933-3
Abstract
Concerns regarding disinfection byproducts (DBPs) in drinking water persist. DBPs are easier to measure in water treatment plants (WTPs) than in water distribution systems (WDSs), where access is difficult, especially during adverse weather conditions. Machine learning (ML) models can improve predictions of DBPs in WDSs. This study developed multiple ML models to predict trihalomethanes (THMs), haloacetic acids (HAAs), dichloroacetonitrile (DCAN), and N-nitrosodimethylamine (NDMA) in WDSs using data collected over 13 years (2008–2020) from 113 water supply systems (WSS) in Ontario. Data were collected quarterly (four times per year), following Ontario's regulatory requirements. Four common ML model types were trained and tested using different datasets: linear regressor (LR), random forest regressor (RFR), support vector regressor (SVR), and artificial neural networks, the last in two variants, with multiple-fold cross-validation (ANN-MV) and with single-fold validation (ANN-SV). R² values for the training datasets of the THMs, HAAs, DCAN, and NDMA models ranged from 0.533 to 0.976, 0.560 to 0.980, 0.602 to 0.993, and 0.449 to 0.858, respectively. For the testing datasets, R² ranged from 0.517 to 0.939, 0.437 to 0.945, 0.565 to 0.973, and 0.517 to 0.718, respectively. For THMs, HAAs, and DCAN, the ANN-SV models performed best, followed by the RFR model, whereas for NDMA, SVR was the best model, followed by the LR model. Some models predicted DBPs reliably, suggesting they could replace costly sampling and experimental analysis of DBPs in WDSs, thereby enhancing DBP control in WDSs and reducing human exposure and associated risks. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
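The abstract reports R² separately for training and testing datasets. As a minimal sketch of what that comparison involves, the Python below fits a plain least-squares line on a training split and scores it on a held-out split. Everything here is an assumption for illustration: the data are synthetic, the single predictor and response are hypothetical, the 80/20 split is arbitrary, and the helper names `r_squared` and `fit_line` are invented. It does not reproduce the study's data, features, or models.

```python
import random


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data (assumption): one hypothetical predictor and a noisy response.
rng = random.Random(0)
x = [rng.uniform(10.0, 100.0) for _ in range(200)]
y = [1.2 * xi + 5.0 + rng.gauss(0.0, 8.0) for xi in x]

# Shuffle the row indices and hold out 20% of the rows for testing (assumption).
indices = list(range(len(x)))
rng.shuffle(indices)
cut = int(0.8 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]

x_train = [x[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
x_test = [x[i] for i in test_idx]
y_test = [y[i] for i in test_idx]

# Fit on the training rows only, then score both splits with the same model.
slope, intercept = fit_line(x_train, y_train)
r2_train = r_squared(y_train, [slope * xi + intercept for xi in x_train])
r2_test = r_squared(y_test, [slope * xi + intercept for xi in x_test])

print(f"training R2 = {r2_train:.3f}, testing R2 = {r2_test:.3f}")
```

A testing R² well below the training R² would suggest the model fits the training rows better than it generalizes, which is why the two ranges are reported side by side.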
Pages: 3776-3794
Page count: 18
Related papers
67 in total
  • [1] Abiodun O.I., Jantan A., Omolara A.E., Dada K.V., Umar A.M., Linus O.U., Kiru M.U., Comprehensive review of artificial neural network applications to pattern recognition, IEEE Access, 7, pp. 158820-158846, (2019)
  • [2] Agarap A.F., Deep learning using rectified linear units (ReLU), (2018)
  • [3] Ain Q.T., Ali M., Riaz A., Noureen A., Kamran M., Hayat B., Rehman A., Sentiment analysis using deep learning techniques: a review, Int J Adv Comput Sci Appl, 8, 6, (2017)
  • [4] Beauchamp N., La O., Simard S., Dorea C., Relationships between DBP concentrations and differential UV absorbance in full-scale conditions, Water Res, 131, pp. 110-121, (2018)
  • [5] Boyer T.H., Singer P.C., Bench-scale testing of a magnetic ion exchange resin for removal of disinfection by-product precursors, Water Res, 39, pp. 1265-1276, (2005)
  • [6] Breiman L., Random forests, Mach Learn, 45, 1, pp. 5-32, (2001)
  • [7] Chowdhury S., Champagne P., McLellan P.J., Models for predicting disinfection byproducts (DBPs) formation in drinking waters: a chronological review, Sci Total Environ, 407, 14, pp. 4189-4206, (2009)
  • [8] Chowdhury S., Alhooshani K., Karanfil T., Disinfection byproducts in swimming pool: occurrences, implications and future needs, Water Res, 53, pp. 68-109, (2014)
  • [9] Chowdhury S., Mazumder M.A.J., Alhooshani K., Al-Suwaiyan M., Reduction of DBPs in synthetic water by indoor techniques and its implications on exposure and health risk, Sci Total Environ, 691, pp. 621-630, (2019)
  • [10] Chowdhury S., Sattar K.A., Rahman S.M., Investigating bromide incorporation factor (BIF) and model development for predicting THMs in drinking water using machine learning, Sci Total Environ, 906, 1, (2024)