A Brief Review of Random Forests for Water Scientists and Practitioners and Their Recent History in Water Resources

Cited by: 447
Authors
Tyralis, Hristos [1]
Papacharalampous, Georgia [2]
Langousis, Andreas [3]
Affiliations
[1] Elefsina Air Base, Air Force Support Command, Hellen Air Force, Elefsina 19200, Greece
[2] Natl Tech Univ Athens, Sch Civil Engn, Dept Water Resources & Environm Engn, Iroon Polytech 5, Zografos 15780, Greece
[3] Univ Patras, Sch Engn, Dept Civil Engn, Univ Campus, Patras 26504, Greece
Keywords
classification; data-driven; hydrological modeling; hydrology; machine learning; prediction; quantile regression forests; supervised learning; variable importance metrics; MACHINE LEARNING-METHODS; VARIABLE IMPORTANCE MEASURES; SUPPORT VECTOR MACHINE; NEURAL-NETWORK MODELS; ARTIFICIAL-INTELLIGENCE; BIG DATA; OVERLAND-FLOW; SOIL-MOISTURE; LAND-COVER; UPSCALING EVAPOTRANSPIRATION;
DOI
10.3390/w11050910
Chinese Library Classification (CLC)
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Random forests (RF) is a supervised machine learning algorithm that has recently started to gain prominence in water resources applications. However, existing applications are generally restricted to the implementation of Breiman's original algorithm for regression and classification problems, while numerous developments could also be useful in solving diverse practical problems in the water sector. Here we popularize RF and their variants for the practicing water scientist, and discuss related concepts and techniques that have received less attention from the water science and hydrologic communities. In doing so, we review RF applications in water resources, highlight the potential of the original algorithm and its variants, and assess the degree of RF exploitation in a diverse range of applications. Relevant implementations of random forests, as well as related concepts and techniques in the R programming language, are also covered.
Pages: 37