A Tool for Classification and Regression Using Random Forest Methodology: Applications to Landslide Susceptibility Mapping and Soil Thickness Modeling

Cited by: 88
Authors
Lagomarsino, Daniela [1]
Tofani, V. [1]
Segoni, S. [1]
Catani, F. [1]
Casagli, N. [1]
Affiliations
[1] Univ Firenze, Earth Sci Dept, Via La Pira 4, I-50121 Florence, Italy
Keywords
Classification and regression; Random forest; Feature selection; Landslide susceptibility maps; LOGISTIC-REGRESSION; NEURAL-NETWORKS; PROBABILITY; UNITS; TREE; DISCRIMINATION; RAINFALL; FUZZY; HAND
DOI
10.1007/s10666-016-9538-y
Chinese Library Classification (CLC) number
X [Environmental science, safety science]
Subject classification code
08; 0830
Abstract
Classification and regression problems are a central issue in the geosciences. In this paper, we present Classification and Regression Treebagger (ClaReT), a tool for classification and regression based on the random forest (RF) technique. ClaReT is developed in Matlab and has a simple graphical user interface (GUI) that simplifies model implementation, standardizes the method, and makes the classification and regression process reproducible. The tool automatically performs feature selection based on a quantitative criterion and allows a large number of explanatory variables to be tested. First, it ranks and displays the parameter importance; then, it selects the optimal configuration of explanatory variables; finally, it performs the classification or regression for an entire dataset. It can also evaluate the results in terms of misclassification error or root mean squared error. We tested the applicability of ClaReT in two case studies. In the first, we used ClaReT in classification mode to identify the best subset of landslide conditioning variables (LCVs) and to obtain a landslide susceptibility map (LSM) of the Arno river basin (Italy). In the second, we used ClaReT in regression mode to produce a soil thickness map of the Terzona catchment, a small sub-basin of the Arno river basin. In both cases, we validated the results and compared them with other state-of-the-art techniques. We found that ClaReT produced better results and was more straightforward to apply, and that it could serve as a valuable tool for assessing the importance of the variables involved in the modeling.
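The two evaluation measures named in the abstract have standard definitions, which the following minimal Python sketch illustrates. It is not taken from ClaReT, which is a Matlab tool; the function names `misclassification_error` and `root_mean_squared_error` are hypothetical, chosen here only for illustration.

```python
import math


def misclassification_error(y_true, y_pred):
    """Fraction of samples whose predicted class differs from the true class.

    Hypothetical helper for illustration; not part of ClaReT.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared difference between observed and predicted values.

    Hypothetical helper for illustration; not part of ClaReT.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))
```

In classification mode (for example, landslide versus no landslide), the first measure would apply to predicted class labels; in regression mode (for example, soil thickness), the second would apply to predicted continuous values.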
Pages: 201-214
Page count: 14