Water quality index modeling using random forest and improved SMO algorithm for support vector machine in Saf-Saf river basin

Authors
Bachir Sakaa
Ahmed Elbeltagi
Samir Boudibi
Hicham Chaffaï
Abu Reza Md. Towfiqul Islam
Luc Cimusa Kulimushi
Pandurang Choudhari
Azzedine Hani
Youssef Brouziyne
Yong Jie Wong
Affiliations
[1] Scientific and Technical Research Center on Arid Regions (CRSTRA), Faculty of Earth Sciences, Laboratory of Water Resource and Sustainable Development (REED)
[2] Badji Mokhtar University, Agricultural Engineering Dept, Faculty of Agriculture
[3] Mansoura University, Department of Disaster Management
[4] Begum Rokeya University, Department of Environmental Studies
[5] University of Lay Adventists of Kigali, Department of Geography
[6] University of Mumbai, International Water Research Institute
[7] Mohammed VI Polytechnic University (UM6P), Research Center for Environmental Quality Management, Graduate School of Engineering
[8] Kyoto University
Source
Environmental Science and Pollution Research | 2022 / Vol. 29
Keywords
Water quality; Random forest; Sequential minimal optimization; Improved support vector machine; Sensitivity analysis
Abstract
The water quality index is one of the most prominent general indicators for assessing and classifying surface water quality, and it plays a critical role in river water resources management. This research constructs a hybrid artificial intelligence model, namely sequential minimal optimization-support vector machine (SMO-SVM), along with random forest (RF) as a benchmark model, for predicting water quality index values in the Wadi Saf-Saf river basin in Algeria. Fifteen input water quality parameters were employed for constructing the predictive models: biochemical oxygen demand (BOD), oxygen saturation (OS), pH, chemical oxygen demand (COD), chloride (Cl⁻), dissolved oxygen (DO), electrical conductivity (EC), total dissolved solids (TDS), nitrate-nitrogen (NO₃-N), nitrite-nitrogen (NO₂-N), phosphate (PO₄³⁻), ammonium (NH₄⁺), temperature (T), turbidity (NTU), and suspended solids (SS). Different input data combinations are evaluated in terms of predictive performance, using a set of statistical metrics and graphical representations. Results show that fewer than 40% of samples were of poor water quality during the dry season in the downstream northeastern part of the basin. The findings also show that the RF model generally produces more precise water quality index predictions than the SMO-SVM model in both the training and testing stages. Although thirteen input parameters attain the optimal predictive performance (R² testing = 0.82, RMSE testing = 5.17), a combination of only five input parameters (pH, EC, TDS, T, and oxygen saturation) gives the second-best predictive precision (R² testing = 0.81, RMSE testing = 5.55). The sensitivity analysis indicates that the predictive outcomes are sensitive to all of the chosen input variables except NO₂⁻. Overall, the RF model improves on earlier tools for predicting the water quality index, in terms of both predictive performance and a reduction in the number of input variables.
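The two statistics quoted above, R² and RMSE, can be illustrated with a minimal sketch. It assumes R² is the coefficient of determination (the record does not say whether the paper uses this definition or the squared correlation coefficient), and the observed and predicted index values below are invented for illustration, not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical water quality index values (not data from the paper).
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 73.0, 79.0]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

A lower RMSE and a higher R² on the testing set both indicate better predictions, which is the sense in which the thirteen-input and five-input models are compared above.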
Pages: 48491-48508
Page count: 17