Prediction of water quality indexes with ensemble learners: Bagging and boosting

Cited by: 58
Authors
Aldrees, Ali [1]
Awan, Hamad Hassan [2]
Javed, Muhammad Faisal [3]
Mohamed, Abdeliazim Mustafa [1]
Affiliations
[1] Prince Sattam bin Abdulaziz Univ, Coll Engn Al Kharj, Dept Civil Engn, Al Kharj 11942, Saudi Arabia
[2] Natl Univ Sci & Technol, Natl Inst Transportat SCEE, Islamabad 44000, Pakistan
[3] COMSATS Univ Islamabad, Dept Civil Engn, Abbottabad Campus, Abbottabad 22060, Pakistan
Keywords
Total dissolved solids; Electrical conductivity; Backpropagation neural network; River flow; Models; Area
DOI
10.1016/j.psep.2022.10.005
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Assessing river water quality is one of the most important tasks in improving water resources management plans. A water quality index (WQI) takes multiple water quality factors into account simultaneously. Traditionally, deriving the sub-indices for WQI computations is time-consuming and error-prone. Reliable and effective machine learning (ML) algorithms have therefore become essential for predicting the WQI of such a matrix. This study predicts WQI, i.e., total dissolved solids (TDS) and electrical conductivity (EC), using ML techniques that combine individual learners with ensemble learners (bagging and boosting), implemented in Anaconda (Python). Weak learners are combined into a strong ensemble learner using adaptive boosting, bagging, and random forest (RF) as a modified bagging method. The ensemble methods are applied to the weak (individual) learners, namely multi-layer perceptron neural networks (MLPNN), support vector machines (SVM), and decision trees (DT), all used for regression. The data comprise 372 monthly readings with eight characteristics used to forecast the results. Twenty boosting and bagging sub-models were trained on the collected readings and then optimized to produce the highest R2. Additionally, K-fold cross-validation with R2, RMSE, and MAE is used to validate the testing data. Furthermore, statistical performance indices (e.g., MAE, RMSE, NSE, MSE, and RMLSE) are used to compare the ensemble models with the individual ones. The results show that boosting and bagging improve the response of the individual models. RF, with an R2 of 0.958 and 0.964 (TDS and EC), and DT with bagging, with an R2 of 0.954 and 0.961 (TDS and EC), reported the fewest errors and gave the most reliable and precise performance. In general, ML ensemble models improve on the performance of individual models.
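The evaluation metrics and K-fold validation named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the unshuffled contiguous fold layout, the choice of 10 folds, and the toy values are all assumptions made for illustration (the abstract gives only the sample count of 372). In the form used here, NSE and R2 share the same formula, 1 - SS_res / SS_tot.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    The Nash-Sutcliffe efficiency (NSE) uses the same expression,
    with y_true as observations and y_pred as simulated values.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: no shuffling; the first n_samples % k folds get one
    extra sample so that every sample is tested exactly once.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


if __name__ == "__main__":
    # Toy values for illustration only, not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.0, 2.0, 3.0, 5.0]
    print("MAE :", mae(observed, predicted))
    print("RMSE:", rmse(observed, predicted))
    print("R2  :", r2_score(observed, predicted))

    # 372 readings as in the abstract; k = 10 is an assumed value.
    for fold, (train_idx, test_idx) in enumerate(k_fold_indices(372, 10)):
        print(fold, len(train_idx), len(test_idx))
```

In a full workflow, each fold's training indices would be used to fit a model (for example a bagged decision tree), and the metrics above would be computed on that fold's test indices and averaged across folds.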
Pages: 344-361
Page count: 18