Assessing spatial transferability of a random forest metamodel for predicting drainage fraction

Cited by: 15
Authors
Bjerre, Elisa [1, 2]
Fienen, Michael N. [3]
Schneider, Raphael [2]
Koch, Julian [2]
Hojberg, Anker L. [2]
Affiliations
[1] Univ Copenhagen UCPH, Dept Geosci & Nat Resource Management IGN, Oester Voldgade 10, DK-1350 Copenhagen, Denmark
[2] Geol Survey Denmark & Greenland GEUS, Oester Voldgade 10, DK-1350 Copenhagen, Denmark
[3] US Geol Survey Upper Midwest Water Sci Ctr, 8505 Res Way, Middleton, WI 53562 USA
Funding
EU Horizon 2020;
Keywords
Machine learning; Predictive modelling; Model portability; Model generalization; Histogram distance; Drain partitioning; CENTRAL VALLEY; TILE DRAINAGE; WATER-FLOW; GROUNDWATER; MODELS; SURFACE; SCALE; CALIFORNIA; DISCHARGE; PATTERNS;
DOI
10.1016/j.jhydrol.2022.128177
Chinese Library Classification number
TU [Building Science];
Discipline classification code
0813;
Abstract
Fully distributed hydrological models are widely used in groundwater management, but model speed and data requirements impede their use for decision support purposes. Metamodels provide a simpler and faster model which emulates the underlying complex model using machine learning techniques. However, metamodel predictions beyond the ranges, in space and/or time, of training data are highly uncertain, and thus it is important to assess the predictive model performance to ranges outside the training data, i.e., model transferability. We present a novel methodology for evaluating model transferability to areas not contained in the training data set, based on various metrics that quantify the differences in covariate distributions between training and testing data. The transferability method can be employed as a screening tool to assess the suitability of a metamodel for spatial prediction beyond its training domain. We evaluated this transferability approach on a Random Forest metamodel of a 1000 km² fully distributed coupled groundwater model for predicting drainage fraction, the partitioning of infiltrating water between drains and groundwater. We conducted spatial cross-validation on 9 holdout sub-basins to assess metamodel transferability beyond sampling locations and compared this estimate with a random split-sample validation test. Using mappable covariates only, the metamodel showed high performance (R² = 0.79) tested on a 20% randomly sampled holdout. Conversely, metamodel performance significantly decreased for the 9 spatial holdouts (R² ranging from 0.13 to 0.61). We document that the proposed transferability metric correlates with metamodel predictive performance, and demonstrate its use to assess model transferability to datasets outside the training data spatial domain.
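The abstract describes transferability metrics that quantify how far the covariate distributions of a holdout area differ from those of the training data. The abstract does not say which metrics were used, so the following is only a minimal sketch of one possible histogram distance: half the L1 distance between normalized histograms of a single covariate, which is 0 for identical distributions and 1 for non-overlapping ones. The function names, the equal-width binning, and the default of 10 bins are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence


def normalized_histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> List[float]:
    """Return bin frequencies (summing to 1) over equal-width bins on [lo, hi]."""
    if not values:
        raise ValueError("values must not be empty")
    if hi <= lo or bins < 1:
        raise ValueError("need hi > lo and bins >= 1")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def histogram_distance(train: Sequence[float], test: Sequence[float], bins: int = 10) -> float:
    """Half the L1 distance between the two normalized histograms (range 0 to 1).

    Hypothetical screening metric: larger values mean the test covariate
    distribution lies further from what the model saw during training.
    """
    if not train or not test:
        raise ValueError("train and test must not be empty")
    lo = min(min(train), min(test))
    hi = max(max(train), max(test))
    if hi == lo:
        return 0.0  # every value is identical, so the distributions coincide
    p = normalized_histogram(train, lo, hi, bins)
    q = normalized_histogram(test, lo, hi, bins)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))
```

In a screening workflow of the kind the abstract outlines, such a distance could be computed per covariate for each spatial holdout and then compared against the holdout R² to check whether larger distances coincide with poorer predictions.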
Pages: 11