Debris Flow Susceptibility and Its Reliability Based on Random Forest and GIS

Cited by: 9
Authors
Zhang S. [1]
Wu G. [1]
Affiliations
[1] Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu
Source
Diqiu Kexue - Zhongguo Dizhi Daxue Xuebao/Earth Science - Journal of China University of Geosciences | 2019 / Vol. 44 / No. 09
Keywords
Conditioning factors combination; Debris-flow susceptibility; Engineering geology; GIS; Quadratic discriminant analysis; Random forest; Reliability; Support vector machine
DOI
10.3799/dqkx.2019.081
Abstract
The models widely used in GIS for debris-flow susceptibility (DFS) assessment remain inadequate in several respects. Models based on classical statistical theory (e.g. information value, weight of evidence, and certainty factors) require the debris-flow conditioning factors to be independent of one another, and the weights of these factors depend on the classification method. Linear machine learning models may fail on nonlinear classification problems, whereas hyper-parameter tuning of the usual nonlinear techniques is often difficult. Random forest (RF) can resolve most of the problems of these models, but has rarely been applied in DFS assessment. This article investigates DFS assessment with RF and evaluates its reliability, using 4 models whose hyper-parameters are tuned by Bayesian optimization, namely random forest (RF), linear support vector machine (LSVM), radial basis function support vector machine (RBF-SVM), and quadratic discriminant analysis (QDA), together with 26 conditioning factors. A modified five-fold cross-validation method is first adopted to evaluate the DFS assessment. The rank of the relative weights from RF and the Monte Carlo method are then used, respectively, to investigate the reliability of the DFS assessment under different combinations of debris-flow conditioning factors and under random sample splits. Results demonstrate that 21 of the 26 debris-flow conditioning factors reflect differences between environments with different debris-flow rates. The relative weight rank from RF can effectively determine the locally optimal combination of factors for the 4 models. The uncertainty of the susceptibility assessment resulting from the random sample split is most significant in the medium susceptibility zone (0.4~0.6), and can be reduced by increasing the proportion of the model-building sample and by improving the susceptibility model. The prediction performance of RF is AUC = 0.86, overall accuracy = 0.79, F1 score = 0.66, and Brier score = 0.14, and its reliability is the best among the 4 models. Therefore, RF can be a superior model for quantitative DFS assessment. © 2019, Editorial Department of Earth Science. All rights reserved.
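The four scores reported in the abstract (AUC, overall accuracy, F1 score, Brier score) can be illustrated with a minimal sketch. It is not the authors' code. The toy labels and probabilities, the 0.5 decision threshold, and all function names are assumptions made for illustration only, and the definitions used are the standard textbook ones.

```python
# Minimal, standard-library sketch of the four scores named in the abstract.
# Assumptions: labels are 1 (debris flow) / 0 (no debris flow), y_prob is the
# predicted susceptibility in [0, 1], and 0.5 is a hypothetical class threshold.

def classify(y_prob, threshold=0.5):
    """Turn susceptibility values into hard 0/1 predictions."""
    return [1 if p >= threshold else 0 for p in y_prob]

def overall_accuracy(y_true, y_prob, threshold=0.5):
    """Share of samples whose hard prediction matches the label."""
    pred = classify(y_prob, threshold)
    return sum(t == p for t, p in zip(y_true, pred)) / len(y_true)

def f1_score(y_true, y_prob, threshold=0.5):
    """Harmonic mean of precision and recall for the positive class."""
    pred = classify(y_prob, threshold)
    tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def brier_score(y_true, y_prob):
    """Mean squared difference between probability and outcome (lower is better)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def auc(y_true, y_prob):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy data, invented for illustration; not taken from the study.
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.3, 0.8]

scores = {
    "AUC": auc(y_true, y_prob),
    "overall accuracy": overall_accuracy(y_true, y_prob),
    "F1 score": f1_score(y_true, y_prob),
    "Brier score": brier_score(y_true, y_prob),
}
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

AUC and the Brier score are computed from the continuous susceptibility values, whereas overall accuracy and F1 depend on the chosen threshold, so the latter two would change if a cut-off other than the assumed 0.5 were used.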
Pages: 3115-3134
Page count: 19