Prediction of water quality indexes with ensemble learners: Bagging and boosting

Cited by: 59
Authors
Aldrees, Ali [1]
Awan, Hamad Hassan [2]
Javed, Muhammad Faisal [3]
Mohamed, Abdeliazim Mustafa [1]
Affiliations
[1] Prince Sattam bin Abdulaziz Univ, Coll Engn Al Kharj, Dept Civil Engn, Al Kharj 11942, Saudi Arabia
[2] Natl Univ Sci & Technol, Natl Inst Transportat SCEE, Islamabad 44000, Pakistan
[3] COMSATS Univ Islamabad, Dept Civil Engn, Abbottabad Campus, Abbottabad 22060, Pakistan
Keywords
Total dissolved solids; Electrical conductivity; Backpropagation neural network; River flow; Models; Area
DOI
10.1016/j.psep.2022.10.005
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Assessing river water quality is one of the most important tasks in improving water resources management plans. A water quality index (WQI) takes multiple water quality factors into account simultaneously. Traditionally, deriving the sub-indices for WQI computations takes a long time and is frequently error-prone. Reliable and effective machine learning (ML) algorithms have therefore become essential for predicting the WQI of such a matrix. This study predicts WQI, i.e., total dissolved solids (TDS) and electrical conductivity (EC), using ML techniques that combine individual learners with ensemble learners (bagging and boosting). The models are implemented in Python using Anaconda. Weak learners are combined into a strong ensemble learner using adaptive boosting, bagging, and random forest (RF) as a modified bagging method. The ensemble methods are applied to the weak, or individual, learners: multi-layer perceptron neural networks (MLPNN), support vector machines (SVM), and decision trees (DT), all used for regression. The data set comprises 372 readings collected monthly, with eight input features used to forecast the outputs. Twenty boosting and bagging sub-models were trained on the collected readings and then optimized to give the highest R². K-fold cross-validation with R², RMSE, and MAE is used to validate the models on the testing data. Statistical performance indices (e.g., MAE, RMSE, NSE, MSE, and RMLSE) are also used to compare the ensemble models with the individual ones. The results show that the boosting and bagging learners improve the response of the individual models. RF, with R² of 0.958 and 0.964 (TDS and EC), and DT with bagging, with R² of 0.954 and 0.961 (TDS and EC), reported the fewest errors and gave the most reliable and precise performance. In general, ML ensemble models improve on the performance of individual models.
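The workflow the abstract describes (bagging weak regressors, then scoring them with K-fold cross-validation using R², RMSE, and MAE) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic with a single feature instead of the study's eight, the weak learner is a one-split regression tree (a "stump") rather than a full decision tree, and every function name here is hypothetical.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    centre = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - centre) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    return math.sqrt(mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def mae(y_true, y_pred):
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def fit_stump(xs, ys):
    """Weak learner: pick the single threshold that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:  # both sides of the split stay non-empty
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # every x identical: fall back to a constant prediction
        return (xs[0], mean(ys), mean(ys))
    return best[1:]


def predict_stump(model, x):
    threshold, left_mean, right_mean = model
    return left_mean if x < threshold else right_mean


def fit_bagging(xs, ys, n_estimators, rng):
    """Bagging: fit each weak learner on a bootstrap resample of the training set."""
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Ensemble prediction is the average of the weak learners' predictions."""
    return mean([predict_stump(m, x) for m in models])


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, k, n_estimators, seed=0):
    """Return one (R2, RMSE, MAE) tuple per held-out fold."""
    rng = random.Random(seed)
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        models = fit_bagging(train_x, train_y, n_estimators, rng)
        preds = [predict_bagging(models, xs[i]) for i in test_idx]
        truth = [ys[i] for i in test_idx]
        scores.append((r2_score(truth, preds), rmse(truth, preds), mae(truth, preds)))
    return scores


# Synthetic stand-in data (NOT the study's readings): one feature, noisy linear target.
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(100)]
ys = [50.0 * x + data_rng.gauss(0.0, 20.0) for x in xs]

scores = cross_validate(xs, ys, k=5, n_estimators=25)
for fold, (r2, err, abs_err) in enumerate(scores, start=1):
    print(f"fold {fold}: R2={r2:.3f} RMSE={err:.2f} MAE={abs_err:.2f}")
```

In practice a library such as scikit-learn offers `BaggingRegressor`, `AdaBoostRegressor`, and `RandomForestRegressor` for the same purpose. The abstract states only that Anaconda (Python) was used, so which implementation the study relied on is not known from this record.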
Pages: 344-361
Number of pages: 18