Low-flow estimation beyond the mean - expectile loss and extreme gradient boosting for spatiotemporal low-flow prediction in Austria

Cited by: 7
Authors
Laimighofer, Johannes [1]
Melcher, Michael [2]
Laaha, Gregor [1]
Affiliations
[1] Univ Nat Resources & Life Sci, Inst Stat, Dept Landscape Spatial & Infrastruct Sci, Peter Jordan Str 82-1, A-1190 Vienna, Austria
[2] FH JOANNEUM Univ Appl Sci, Inst Informat Management, Graz, Austria
Keywords
UNGAUGED BASINS; MODELS; RUNOFF; INDEXES; UNCERTAINTY; SIMULATION; CATCHMENTS; REGRESSION; HYDROLOGY; ENSEMBLE
DOI
10.5194/hess-26-4553-2022
Chinese Library Classification
P [Astronomy, Earth Sciences]
Subject classification code
07
Abstract
Accurate predictions of seasonal low flows are critical for a number of water management tasks that require inferences about water quality and the ecological status of water bodies. This paper proposes an extreme gradient tree boosting model (XGBoost) for predicting monthly low flow in ungauged catchments. Particular emphasis is placed on the lowest values (in the magnitude of annual low flows and below) by implementing the expectile loss function in the XGBoost model. For this purpose, we test expectile loss functions based on decreasing expectiles (from τ = 0.5 to 0.01) that give increasing weight to lower values. These are compared to common loss functions such as mean and median absolute loss. Model optimization and evaluation are conducted using a nested cross-validation (CV) approach that includes recursive feature elimination (RFE) to promote parsimonious models. The methods are tested on a comprehensive dataset of 260 stream gauges in Austria, covering a wide range of low-flow regimes. Our results demonstrate that the expectile loss function can yield high prediction accuracy, but the performance drops sharply for low expectile models. With a median R² of 0.67, the 0.5 expectile yields the best-performing model. The 0.3 and 0.2 expectiles perform slightly worse, but still outperform the common median and mean absolute loss functions. All expectile models include some stations with moderate and poor performance that can be attributed to some systematic error, while the seasonal and annual variability is well covered by the models. Results for the prediction of low extremes show an increasing performance in terms of R² for smaller expectiles (0.01, 0.025, 0.05), though at the cost of classifying too many events as extremes for each station. We found that the application of different expectiles leads to a trade-off between overall performance, prediction performance for extremes, and misclassification of extreme low-flow events. Our results show that the 0.1 or 0.2 expectiles perform best with respect to all three criteria. The resulting extreme gradient tree boosting model covers seasonal and annual variability well and provides a viable approach for spatiotemporal modeling of a range of hydrological variables representing average conditions and extreme events.
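The expectile loss mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the standard asymmetric squared-error form, in which a residual y - f is weighted by τ when the prediction f lies at or below the observation and by 1 - τ when it lies above. The function names `expectile_grad_hess` and `sample_expectile` are hypothetical. The gradient and Hessian are returned in the per-sample form that gradient boosting libraries typically expect from a custom objective.

```python
import numpy as np


def expectile_grad_hess(preds, labels, tau):
    """Gradient and Hessian of the expectile loss w.r.t. the predictions.

    Assumed loss per sample: w * (y - f)^2, with w = tau if y >= f, else 1 - tau.
    """
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)
    residual = labels - preds
    # Small tau penalises over-prediction (y < f) more than under-prediction,
    # which pulls the fitted values towards the low end of the distribution.
    weight = np.where(residual >= 0, tau, 1.0 - tau)
    grad = -2.0 * weight * residual
    hess = 2.0 * weight
    return grad, hess


def sample_expectile(y, tau, n_iter=100):
    """Tau-expectile of a sample, found by Newton steps on a constant prediction."""
    y = np.asarray(y, dtype=float)
    m = float(y.mean())
    for _ in range(n_iter):
        grad, hess = expectile_grad_hess(np.full_like(y, m), y, tau)
        m -= grad.sum() / hess.sum()
    return m


if __name__ == "__main__":
    flows = np.array([1.0, 2.0, 3.0, 10.0])  # made-up values, not data from the paper
    for tau in (0.5, 0.2, 0.1):
        print(tau, sample_expectile(flows, tau))
```

For τ = 0.5 the weights are symmetric and the expectile coincides with the sample mean, while smaller τ moves the estimate towards the low values. To use such a loss in a boosting library, the gradient function would have to be wrapped in that library's custom-objective interface, which is not shown here.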
Pages: 4553-4574
Number of pages: 22