Development of an Extreme Gradient Boosting Model Integrated With Evolutionary Algorithms for Hourly Water Level Prediction

Cited by: 34
Authors
Nguyen, Duc Hai [1, 2]
Hien Le, Xuan [2 ]
Heo, Jae-Yeong [1 ]
Bae, Deg-Hyo [1 ]
Affiliations
[1] Sejong Univ, Dept Civil & Environm Engn, Seoul 143747, South Korea
[2] Thuyloi Univ, Fac Water Resources Engn, Hanoi 116705, Vietnam
Source
IEEE ACCESS | 2021 / Vol. 9
Keywords
Predictive models; Radio frequency; Floods; Machine learning algorithms; Genetic algorithms; Urban areas; Prediction algorithms; Extreme gradient boosting; evolutionary algorithms; water level prediction; tree-based model; urban floods; DIFFERENTIAL-EVOLUTION; GENETIC ALGORITHMS; NEURAL-NETWORK; OPTIMIZATION; WAVELET; ANFIS; CAPACITY;
DOI
10.1109/ACCESS.2021.3111287
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
The establishment of reliable water level prediction models is vital for urban flood control and planning. In this paper, we develop hybrid models (GA-XGBoost and DE-XGBoost) that couple two evolutionary algorithms, a genetic algorithm (GA) and a differential evolution (DE) algorithm, with the extreme gradient boosting (XGBoost) model for hourly water level prediction. The Jungrang urban basin located on the Han River, South Korea, was selected as a case study for the proposed models. Hourly rainfall and water level data collected between 2003 and 2020 were used to construct the selected models and evaluate their performance. To compare the prediction efficiency, two other tree-based models were chosen: classification and regression tree (CART) and random forest (RF) models. A comparison of the results showed that the two hybrid models, GA-XGBoost and DE-XGBoost, outperformed RF and CART in the multistep-ahead prediction of water level: the relative errors of the hybrid models ranged from 2.18% to 9.21%, compared to 3.76%-10.41% and 2.99%-11.88% for RF and CART, respectively. Reliable performance was also supported by other measures. In general, the GA-XGBoost and DE-XGBoost models displayed similar performance, with only small differences between them. The CART model was not preferable for multistep-ahead water level predictions, even though it yielded the lowest Akaike information criterion (AIC) value. This study verifies that despite having some drawbacks when considering long step-ahead prediction and model complexity, hybrid XGBoost models might be superior to many existing models for hourly water level prediction.
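The abstract does not say how the evolutionary algorithms are coupled with XGBoost. One common arrangement, assumed here purely for illustration, is to let the evolutionary algorithm search the model's hyperparameters by minimizing a validation error. The sketch below is a minimal pure-Python differential evolution loop (the classic DE/rand/1/bin scheme). The objective `toy_validation_error`, the parameter names `learning_rate` and `max_depth`, the bounds, and all DE settings are hypothetical stand-ins, not the authors' configuration; in a real setup the objective would train a model and return its error on held-out data.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f_weight=0.8,
                           cr=0.9, generations=200, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated coordinate
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    # Mutation: base vector plus scaled difference vector.
                    value = pop[a][d] + f_weight * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_validation_error(params):
    """Hypothetical stand-in for a model's validation error.

    A real objective would fit a booster with these hyperparameters and
    return, for example, its RMSE on a validation set.
    """
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + ((max_depth - 6.0) / 10.0) ** 2


# Assumed search ranges for the two illustrative hyperparameters.
BOUNDS = [(0.01, 0.5), (2.0, 12.0)]
best_params, best_error = differential_evolution(toy_validation_error, BOUNDS)
print(best_params, best_error)
```

A genetic algorithm could be dropped into the same role by replacing the mutation and selection steps with crossover, mutation, and fitness-based selection; the objective function would stay unchanged.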
Pages: 125853-125867
Page count: 15