Advancing water quality assessment and prediction using machine learning models, coupled with explainable artificial intelligence (XAI) techniques like shapley additive explanations (SHAP) for interpreting the black-box nature

Cited by: 44
Authors
Makumbura, Randika K. [1]
Mampitiya, Lakindu [1]
Rathnayake, Namal [2]
Meddage, D. P. P. [3]
Henna, Shagufta [4]
Dang, Tuan Linh [5]
Hoshino, Yukinobu [6]
Rathnayake, Upaka [7]
Affiliations
[1] Water Resources Management & Soft Comp Res Lab, Millennium City 10150, Athurugiriya, Sri Lanka
[2] Univ Tokyo, Fac Engn, Dept Civil Engn, 1 Chome 1-1 Yayoi, Bunkyo City, Tokyo 1138656, Japan
[3] Univ New South Wales, Sch Engn & Informat Technol, Canberra, ACT, Australia
[4] Atlantic Technol Univ, Dept Comp, Letterkenny F92 FC93, Ireland
[5] Hanoi Univ Sci & Technol, Sch Informat & Commun Technol, 1 Dai Co Viet Rd, Hanoi 100000, Vietnam
[6] Kochi Univ Technol, Sch Syst Engn, 185 Miyanokuchi, Kami, Kochi 7828502, Japan
[7] Atlantic Technol Univ, Fac Engn & Design, Dept Civil Engn & Construct, Sligo F91 YW50, Ireland
Keywords
Water quality assessment; Machine learning; Explainable artificial intelligence; Shapley additive explanations; Prediction models; Scatter plot
DOI
10.1016/j.rineng.2024.102831
Chinese Library Classification (CLC) number
T [Industrial Technology]
Subject classification code
08
Abstract
Water quality assessment and prediction play crucial roles in ensuring the sustainability and safety of freshwater resources. This study aims to enhance water quality assessment and prediction by integrating advanced machine learning models with XAI techniques. Traditional methods, such as the water quality index, often require extensive data collection and laboratory analysis, making them resource-intensive. The weighted arithmetic water quality index is employed alongside machine learning models, specifically random forest (RF), LightGBM, and XGBoost, to predict water quality. The models' performance was evaluated using metrics such as MAE, RMSE, R², and R. The results demonstrated high predictive accuracy, with XGBoost showing the best performance (R² = 0.992, R = 0.996, MAE = 0.825, and RMSE = 1.381). Additionally, SHAP was used to interpret the model's predictions, revealing that COD and BOD are the most influential factors in determining water quality, while electrical conductivity, chloride, and nitrate had minimal impact. High dissolved oxygen levels were associated with lower water quality index values, which indicate excellent water quality, while pH consistently influenced predictions. The findings suggest that the proposed approach offers a reliable and interpretable method for water quality prediction, which can significantly benefit water specialists and decision-makers.
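For readers unfamiliar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how MAE, RMSE, R², and R are commonly computed for a regression model. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, R² is taken here as the coefficient of determination, and R as the Pearson correlation coefficient, which may differ from the exact definitions used in the paper.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R2 and Pearson R for observed vs. predicted values.

    Hypothetical helper for illustration; definitions are the common
    textbook ones and are an assumption about the paper's usage.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    mae = np.mean(np.abs(residuals))               # mean absolute error
    rmse = np.sqrt(np.mean(residuals ** 2))        # root mean squared error
    ss_res = np.sum(residuals ** 2)                # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2) # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination
    r = np.corrcoef(y_true, y_pred)[0, 1]          # Pearson correlation

    return {"MAE": mae, "RMSE": rmse, "R2": r2, "R": r}


if __name__ == "__main__":
    # Made-up water quality index values, not data from the study.
    observed = [42.0, 55.5, 61.2, 78.9, 90.3]
    predicted = [43.1, 54.0, 62.5, 77.2, 91.0]
    for name, value in regression_metrics(observed, predicted).items():
        print(f"{name}: {value:.3f}")
```

MAE and RMSE are in the same units as the water quality index, so lower is better, whereas R² and R approach 1 as predictions track the observed values more closely.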
Pages: 14