Streamflow forecasting using extreme gradient boosting model coupled with Gaussian mixture model

Cited by: 175
Authors
Ni, Lingling [1]
Wang, Dong [1]
Wu, Jianfeng [1]
Wang, Yuankun [1]
Tao, Yuwei [1]
Zhang, Jianyun [2]
Liu, Jiufu [2]
Affiliations
[1] Nanjing Univ, Sch Earth Sci & Engn, Dept Hydrosci, Key Lab Surficial Geochem,Minist Educ, Nanjing 210023, Peoples R China
[2] Nanjing Hydraul Res Inst, Nanjing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Streamflow forecasting; Extreme gradient boosting; Gaussian mixture model; Modular models; SHORT-TERM-MEMORY; GLOBAL SOLAR-RADIATION; WATER-RESOURCES; PREDICTION; DECOMPOSITION; MACHINE; PRECIPITATION; REGRESSION; MULTIMODEL; SKILL;
DOI
10.1016/j.jhydrol.2020.124901
Chinese Library Classification number
TU [Building Science]
Subject classification code
0813
Abstract
An accurate and reliable forecasting model is important for water resource planning and management. In this study, we developed a hybrid model, GMM-XGBoost, that couples extreme gradient boosting (XGBoost) with a Gaussian mixture model (GMM) for monthly streamflow forecasting. The proposed model follows the modular modeling principle, in which a complex problem is divided into several simpler ones. GMM was applied to cluster streamflow into several groups, using features selected by a tree-based method. The groups were then used to fit several single XGBoost models, and the prediction is a weighted average of these single models. Monthly streamflow data at the Cuntan and Hankou stations in the Yangtze River Basin were used to evaluate the performance of the proposed model. Support vector machine (SVM) and standalone XGBoost were selected as benchmark models for comparison. The results indicated that although all three models performed well on one-month-ahead forecasting, with high Nash-Sutcliffe efficiency coefficient (NSE) and low root mean squared error (RMSE), GMM-XGBoost provided the best accuracy, with a significant improvement in forecasting accuracy. The results suggest that (1) XGBoost is applicable to streamflow forecasting and, in general, performs better than SVM; (2) the cluster analysis-based modular model helps improve accuracy and capture the complicated patterns of hydrological processes; (3) the proposed GMM-XGBoost model is a superior alternative that can provide accurate and reliable predictions for optimal water resources management.
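The abstract names two evaluation metrics (NSE and RMSE) and a weighted average over per-cluster models. The sketch below illustrates those ideas in plain Python and is not the authors' implementation. It assumes a one-dimensional feature, that the averaging weights are the GMM posterior probabilities of each cluster (the abstract does not state how the weights are computed), and that each per-cluster model is a simple callable standing in for a fitted XGBoost regressor. All function names are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean squared error between observed and simulated series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def responsibilities(x, components):
    """Posterior probability of each mixture component given x.

    components is a list of (weight, mean, std) tuples (assumed already fitted).
    """
    densities = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(densities)
    return [d / total for d in densities]


def modular_predict(x, components, experts):
    """Weighted average of per-cluster model outputs.

    Assumption: the weights are the GMM posteriors. Each expert is a callable
    standing in for a per-cluster regressor such as a fitted XGBoost model.
    """
    weights = responsibilities(x, components)
    return sum(w * expert(x) for w, expert in zip(weights, experts))
```

In practice the mixture parameters and the per-cluster regressors would come from fitted library models on multivariate features; the callables here only keep the sketch self-contained.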
Pages: 11
Related papers
55 in total
[1]   Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada [J].
Adamowski, Jan ;
Chan, Hiu Fung ;
Prasher, Shiv O. ;
Ozga-Zielinski, Bogdan ;
Sliusarieva, Anna .
WATER RESOURCES RESEARCH, 2012, 48
[2]   Markov chain-incorporated and synthetic data-supported conditional artificial neural network models for forecasting monthly precipitation in arid regions [J].
Aksoy, Hafzullah ;
Dahamsheh, Ahmad .
JOURNAL OF HYDROLOGY, 2018, 562 :758-779
[3]   APPLICATION OF LINEAR RANDOM MODELS TO 4 ANNUAL STREAMFLOW SERIES [J].
CARLSON, RF ;
MACCORMICK, AJ ;
WATTS, DG .
WATER RESOURCES RESEARCH, 1970, 6 (04) :1070-+
[4]   XGBoost: A Scalable Tree Boosting System [J].
Chen, Tianqi ;
Guestrin, Carlos .
KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, :785-794
[5]   EGBMMDA: Extreme Gradient Boosting Machine for MiRNA-Disease Association prediction [J].
Chen, Xing ;
Huang, Li ;
Xie, Di ;
Zhao, Qi .
CELL DEATH & DISEASE, 2018, 9
[6]   SUPPORT-VECTOR NETWORKS [J].
CORTES, C ;
VAPNIK, V .
MACHINE LEARNING, 1995, 20 (03) :273-297
[7]   Bias correcting precipitation forecasts to improve the skill of seasonal streamflow forecasts [J].
Crochemore, Louise ;
Ramos, Maria-Helena ;
Pappenberger, Florian .
HYDROLOGY AND EARTH SYSTEM SCIENCES, 2016, 20 (09) :3601-3618
[8]   Streamflow prediction using linear genetic programming in comparison with a neuro-wavelet technique [J].
Danandeh Mehr, Ali ;
Kahya, Ercan ;
Olyaie, Ehsan .
JOURNAL OF HYDROLOGY, 2013, 505 :240-249
[9]   Interpretable machine learning for predicting biomethane production in industrial-scale anaerobic co-digestion [J].
De Clercq, Djavan ;
Wen, Zongguo ;
Fei, Fan ;
Caicedo, Luis ;
Yuan, Kai ;
Shang, Ruoxi .
SCIENCE OF THE TOTAL ENVIRONMENT, 2020, 712
[10]   Novel forecasting models for immediate-short-term to long-term influent flow prediction by combining ANFIS and grey wolf optimization [J].
Dehghani, Majid ;
Seifi, Akram ;
Riahi-Madvar, Hossien .
JOURNAL OF HYDROLOGY, 2019, 576 :698-725