Regional hydrological frequency analysis at ungauged sites with random forest regression

Cited by: 109
Authors
Desai, Shitanshu [1]
Ouarda, Taha B. M. J. [1]
Affiliations
[1] Ctr Eau Terre Environm, Inst Natl Rech Sci, INRS ETE, Canada Res Chair Stat Hydroclimatol, 490 La Couronne, Quebec City, PQ G1K 9A9, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Random Forest Regression; Canonical correlation analysis; Regional flood frequency analysis; Ungauged basin; Machine learning; Regional estimation; ARTIFICIAL NEURAL-NETWORKS; MACHINE; MODEL;
DOI
10.1016/j.jhydrol.2020.125861
Chinese Library Classification number
TU [Building Science];
Discipline classification code
0813;
Abstract
Flood quantile estimation at sites with little or no data is important for the adequate planning and management of water resources. Regional Hydrological Frequency Analysis (RFA) deals with the estimation of hydrological variables at ungauged sites. Random Forest (RF) is an ensemble learning technique that uses multiple Classification and Regression Trees (CART) for classification, regression, and other tasks. The RF technique is gaining popularity in a number of fields because of its powerful non-linear and non-parametric nature. In the present study, we investigate the use of Random Forest Regression (RFR) in the estimation step of RFA, based on a case study using data collected from 151 hydrometric stations in the province of Quebec, Canada. RFR is applied to the whole data set and to homogeneous regions of stations delineated by canonical correlation analysis (CCA). The optimal number of trees for the dataset is calculated using the out-of-bag error rate feature of RF. The results of the CCA-based RFR model (CCA-RFR) are compared to results obtained with a number of other linear and non-linear RFA models. CCA-RFR leads to the best performance in terms of root mean squared error. The use of CCA to delineate neighborhoods considerably improves the performance of RFR. RFR is found to be simple to apply and more efficient than more complex models such as Artificial Neural Network-based models.
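The out-of-bag (OOB) step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (not the Quebec stations), the small tree grower, the `max_features = p // 3` heuristic, the depth and leaf-size limits, and the "within 1% of the minimum OOB RMSE" selection rule in the hypothetical helper `pick_n_trees` are all assumptions made for illustration. The idea shown is only that each tree is evaluated on the rows left out of its bootstrap sample, so the error curve over the number of trees comes at no extra cost.

```python
import math
import random

def build_tree(X, y, rows, rng, max_features, min_leaf=5, depth=0, max_depth=6):
    """Grow one regression tree on the (bootstrap) row indices in `rows`."""
    mean = sum(y[i] for i in rows) / len(rows)
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return ("leaf", mean)
    best = None
    # Random feature subset at every split: the "random" part of a random forest.
    for f in rng.sample(range(len(X[0])), max_features):
        order = sorted(rows, key=lambda i: X[i][f])
        ys = [y[i] for i in order]
        total, total_sq = sum(ys), sum(v * v for v in ys)
        s = s_sq = 0.0
        for k in range(1, len(order)):
            s += ys[k - 1]
            s_sq += ys[k - 1] ** 2
            n_left, n_right = k, len(order) - k
            if min(n_left, n_right) < min_leaf or X[order[k - 1]][f] == X[order[k]][f]:
                continue
            # Sum of squared errors of both children, from running sums.
            sse = (s_sq - s * s / n_left) + (total_sq - s_sq - (total - s) ** 2 / n_right)
            if best is None or sse < best[0]:
                best = (sse, f, X[order[k - 1]][f], order[:k], order[k:])
    if best is None:
        return ("leaf", mean)
    _, f, threshold, left, right = best
    grow = lambda part: build_tree(X, y, part, rng, max_features, min_leaf, depth + 1, max_depth)
    return ("node", f, threshold, grow(left), grow(right))

def predict(tree, x):
    """Route one feature vector down the tree and return the leaf mean."""
    while tree[0] == "node":
        _, f, threshold, left, right = tree
        tree = left if x[f] <= threshold else right
    return tree[1]

def oob_rmse_curve(X, y, n_trees, seed=0):
    """OOB RMSE after 1, 2, ..., n_trees trees have been added to the forest."""
    rng = random.Random(seed)
    n = len(X)
    max_features = max(1, len(X[0]) // 3)  # assumed heuristic, not from the paper
    sums, counts, curve = [0.0] * n, [0] * n, []
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        tree = build_tree(X, y, boot, rng, max_features)
        in_bag = set(boot)
        for i in range(n):
            if i not in in_bag:  # row i was never seen by this tree
                sums[i] += predict(tree, X[i])
                counts[i] += 1
        errs = [(sums[i] / counts[i] - y[i]) ** 2 for i in range(n) if counts[i]]
        curve.append(math.sqrt(sum(errs) / len(errs)) if errs else float("nan"))
    return curve

def pick_n_trees(curve, tol=0.01):
    """Hypothetical rule: fewest trees with OOB RMSE within `tol` of the minimum."""
    target = min(curve) * (1 + tol)
    return next(t for t, e in enumerate(curve, start=1) if e <= target)

# Synthetic stand-in data: 150 "sites", 3 descriptors, non-linear response.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 1) for _ in range(3)] for _ in range(150)]
y = [math.sin(3 * x[0]) + x[1] ** 2 + data_rng.gauss(0, 0.1) for x in X]

curve = oob_rmse_curve(X, y, n_trees=100)
best_n = pick_n_trees(curve)
print(f"selected {best_n} trees, OOB RMSE = {curve[best_n - 1]:.3f}")
```

In practice an established random forest library would normally replace the hand-written tree grower; the sketch keeps everything in the standard library so that the OOB bookkeeping stays visible.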
Pages: 8
Related papers
45 in total
  • [1] Application of artificial neural networks in regional flood frequency analysis: a case study for Australia
    Aziz, K.
    Rahman, A.
    Fang, G.
    Shrestha, S.
    [J]. STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT, 2014, 28 (03) : 541 - 554
  • [2] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [3] Chao, 2012, MATH PROBL ENG
  • [4] Depth and homogeneity in regional flood frequency analysis
    Chebana, F.
    Ouarda, T. B. M. J.
    [J]. WATER RESOURCES RESEARCH, 2008, 44 (11)
  • [5] Regional Frequency Analysis at Ungauged Sites with the Generalized Additive Model
    Chebana, F.
    Charron, C.
    Ouarda, T. B. M. J.
    Martel, B.
    [J]. JOURNAL OF HYDROMETEOROLOGY, 2014, 15 (06) : 2418 - 2428
  • [6] Physiographical space-based kriging for regional flood frequency estimation at ungauged sites
    Chokmani, K
    Ouarda, TBMJ
    [J]. WATER RESOURCES RESEARCH, 2004, 40 (12) : 1 - 13
  • [7] Comparison of ice-affected streamflow estimates computed using artificial neural networks and multiple regression techniques
    Chokmani, Karem
    Ouarda, Taha B. M. J.
    Hamilton, Stuart
    Ghedira, M. Hosni
    Gingras, Hugo
    [J]. JOURNAL OF HYDROLOGY, 2008, 349 (3-4) : 383 - 396
  • [8] A Nonlinear Approach to Regional Flood Frequency Analysis Using Projection Pursuit Regression
    Durocher, Martin
    Chebana, Fateh
    Ouarda, Taha B. M. J.
    [J]. JOURNAL OF HYDROMETEOROLOGY, 2015, 16 (04) : 1561 - 1574
  • [9] A comparison of index flood estimation procedures for ungauged catchments
    Grover, PL
    Burn, DH
    Cunderlik, JM
    [J]. CANADIAN JOURNAL OF CIVIL ENGINEERING, 2002, 29 (05) : 734 - 741
  • [10] Regional flood frequency analysis in eastern Australia: Bayesian GLS regression-based methods within fixed region and ROI framework - Quantile Regression vs. Parameter Regression Technique
    Haddad, Khaled
    Rahman, Ataur
    [J]. JOURNAL OF HYDROLOGY, 2012, 430 : 142 - 161