Modeling the organic matter of water using the decision tree coupled with bootstrap aggregated and least-squares boosting

Cited by: 55
Authors
Tahraoui, Hichem [1]
Amrane, Abdeltif [2]
Belhadj, Abd-Elmouneim [1]
Zhang, Jie [3]
Affiliations
[1] Medea Univ, Univ MEDEA, Lab Biomat & Transport Phenomena LBMPT, Nouveau Pole Urbain, Medea 26000, Algeria
[2] Univ Rennes, Ecole Natl Super Chim Rennes, CNRS, ISCR UMR6226, F-35000 Rennes, France
[3] Newcastle Univ, Sch Engn, Merz Court, Newcastle Upon Tyne NE1 7RU, England
Keywords
Water; Physic-chemical parameters; Organic matter; Modeling; Decision tree; Bootstrap aggregates; Least-squares boosting; ARTIFICIAL NEURAL-NETWORKS; INFORMATION GAIN; DRINKING-WATER; SELECTION; COAGULATION; PREDICTION; MODEL; RIVER; ALGORITHMS; REMOVAL
DOI
10.1016/j.eti.2022.102419
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
This work investigates the use of decision trees (DT) enhanced by bootstrap aggregation (Bag) and least-squares boosting (Lsboost) for modeling the organic matter of water from its physicochemical parameters. A database of 500 samples covering 21 physicochemical parameters, including organic matter, was used to build the DT, DT_Bag, and DT_Lsboost models. The training data (364 data points) were resampled with a bootstrap technique to form different training datasets, each used to train a different model. The resulting models were validated on a dataset of 91 samples. The predicted outputs of the individual DT models were then combined by simple averaging. In addition, the data were boosted with the Lsboost technique to increase the strength of a weak learning algorithm. The model trains the first weak learner with equal weight across all data points in the training set, then trains each subsequent weak learner with weights updated according to the validation result, so as to minimize the mean squared error. The DT_Lsboost model gave good agreement between predicted and experimental organic matter concentrations (the correlation coefficient for the validation dataset was 0.9992), followed by the DT_Bag model with a correlation coefficient of 0.9949. The comparison between DT, DT_Bag, and DT_Lsboost showed the superiority of the DT_Lsboost model (the root mean squared errors for the dataset were 0.1295 for DT_Lsboost, 0.1664 for DT_Bag, and 0.5444 for DT). These results show that the Lsboost technique dramatically improved the DT model, a finding also confirmed by tests of the models on an interpolation dataset of 45 points. The Bag technique was also very effective in optimizing the DT model, as its results were very close to those of the DT_Lsboost model. © 2022 The Authors. Published by Elsevier B.V.
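The two ensemble ideas described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, a synthetic one-dimensional dataset in place of the 21 physicochemical parameters, and hypothetical helper names (`fit_tree`, `fit_bagged`, `fit_lsboost`, `rmse`). The boosting part uses the common residual-fitting formulation of least-squares boosting, which is an assumption and differs in wording from the weight-update description above. The 80/20 split only mirrors the ratio of the 364 training and 91 validation samples.

```python
import math
import random

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def fit_tree(xs, ys, depth=3, min_leaf=2):
    """Fit a small least-squares regression tree on a 1-D input; returns a predictor."""
    leaf_value = mean(ys)
    if depth == 0 or len(xs) < 2 * min_leaf:
        return lambda x: leaf_value
    best = None  # (sum of squared errors, threshold)
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t)
    if best is None:  # no admissible split: fall back to a leaf
        return lambda x: leaf_value
    t = best[1]
    lo = [(x, y) for x, y in zip(xs, ys) if x <= t]
    hi = [(x, y) for x, y in zip(xs, ys) if x > t]
    left_tree = fit_tree([x for x, _ in lo], [y for _, y in lo], depth - 1, min_leaf)
    right_tree = fit_tree([x for x, _ in hi], [y for _, y in hi], depth - 1, min_leaf)
    return lambda x: left_tree(x) if x <= t else right_tree(x)

def fit_bagged(xs, ys, n_models=25, seed=0):
    """Bagging: train one tree per bootstrap resample, combine by simple averaging."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_tree([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in models)

def fit_lsboost(xs, ys, n_rounds=50, learning_rate=0.5):
    """Least-squares boosting: each weak learner (a stump) fits the current residuals."""
    base = mean(ys)
    residuals = [y - base for y in ys]
    learners = []
    for _ in range(n_rounds):
        h = fit_tree(xs, residuals, depth=1, min_leaf=1)
        learners.append(h)
        residuals = [r - learning_rate * h(x) for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(h(x) for h in learners)

def rmse(model, xs, ys):
    return math.sqrt(mean((model(x) - y) ** 2 for x, y in zip(xs, ys)))

# Synthetic data (an assumption for illustration only): a noisy sine curve.
noise = random.Random(1)
all_x = [i / 10 for i in range(100)]
all_y = [math.sin(x) + noise.gauss(0, 0.1) for x in all_x]

# Hold out every fifth point for validation (80 training, 20 validation).
train_x = [x for i, x in enumerate(all_x) if i % 5 != 0]
train_y = [y for i, y in enumerate(all_y) if i % 5 != 0]
val_x = [x for i, x in enumerate(all_x) if i % 5 == 0]
val_y = [y for i, y in enumerate(all_y) if i % 5 == 0]

single = fit_tree(train_x, train_y)
bagged = fit_bagged(train_x, train_y)
boosted = fit_lsboost(train_x, train_y)

for name, model in [("DT", single), ("DT_Bag", bagged), ("DT_Lsboost", boosted)]:
    print(name, "validation RMSE:", round(rmse(model, val_x, val_y), 4))
```

The printed errors depend on the synthetic data and settings chosen here and say nothing about the values reported in the abstract; the sketch only shows how averaging bootstrap-trained trees and sequentially fitting residuals differ in structure.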
Pages: 19