Impact of sampling for landslide susceptibility assessment using interpretable machine learning models

Cited by: 7
Authors
Wu, Bin [1, 2]
Shi, Zhenming [1, 2]
Zheng, Hongchao [1, 2]
Peng, Ming [1, 2]
Meng, Shaoqiang [1, 2, 3]
Affiliations
[1] Tongji Univ, Key Lab Geotech & Underground Engn, Minist Educ, Shanghai 200092, Peoples R China
[2] Tongji Univ, Coll Civil Engn, Dept Geotech Engn, Shanghai 200092, Peoples R China
[3] Tongji Univ, Shanghai Res Inst Intelligent Autonomous Syst, Shanghai 200092, Peoples R China
Funding
UK Research and Innovation (UKRI);
Keywords
Landslide susceptibility; Landslide risk; Machine learning; Sample strategy; Hazard assessment; LOGISTIC-REGRESSION; SLOPE UNITS; UNCERTAINTIES; OPTIMIZATION; EROSION; SYSTEM;
DOI
10.1007/s10064-024-03980-8
Chinese Library Classification (CLC) number
X [Environmental science, safety science];
Discipline classification code
08 ; 0830 ;
Abstract
Landslide susceptibility assessment has made significant strides in meeting the urgent requirements for disaster prevention and mitigation. However, the inherent imbalance in landslide distributions poses challenges, and various sampling strategies have emerged in response. These strategies alter the original dataset distribution, so their impact on susceptibility mapping needs to be better understood. This study integrates multi-source information, including morphological, geological, hydrological, and land-use data from northwestern Oregon, to train four models (Decision Trees, Random Forest, AdaBoost, and Gradient Tree Boosting) using both balanced and imbalanced training sets. Results reveal that models trained on imbalanced datasets generally exhibit superior classification performance. Models trained on balanced datasets predict more positives (landslides) at higher susceptibility levels, while those trained on imbalanced datasets classify more negatives at lower levels. The Shapley Additive Explanations (SHAP) method was used to establish the consistency of model decision-making and to identify the top five most influential factors: distance to roads, slope roughness, geological age, roughness, and elevation. Furthermore, the consequences of false negatives (FN) and false positives (FP) are discussed: FN may lead to loss of life, whereas FP may result from prediction inaccuracies, dataset incompleteness, and forthcoming landslides, so a certain number of FP can be tolerated. The study suggests that models trained on balanced datasets are preferable for minimizing the number of FN and for effectively capturing landslides in high and very high susceptibility areas. The findings provide valuable insights into the impact of the ratio of positives to negatives on landslide susceptibility and offer support for optimizing dataset sampling.
Pages: 19