Prediction of gully erosion susceptibility through the lens of the SHapley Additive exPlanations (SHAP) method using a stacking ensemble model

Cited by: 0

Authors:

Han, Jeongho [1,2]

Guzman, Jorge A. [1]

Chu, Maria L. [1]

Affiliations:

[1] Univ Illinois, GRAINGER Coll Engn, Coll Agr Consumer & Environm Sci, Dept Agr & Biol Engn, ACES, Urbana, IL 61801 USA

[2] Kangwon Natl Univ, Agr & Life Sci Res Inst, Chunchon 24341, South Korea

Source:

JOURNAL OF ENVIRONMENTAL MANAGEMENT | 2025 / Vol. 383

Funding:

National Science Foundation (USA);

Keywords:

Gully erosion susceptibility; Stacked generalization; Explainable artificial intelligence; Machine learning; Shapley additive explanations; LANDSLIDE SUSCEPTIBILITY; SOIL-EROSION; CLASSIFICATION; REGRESSION; CURVATURE; SCALE;

DOI:

10.1016/j.jenvman.2025.125478

Chinese Library Classification (CLC) number:

X [Environmental science, safety science];

Subject classification code:

08 ; 0830 ;

Abstract:

This study develops a novel explainable stacking ensemble model that combines the stacked generalization ensemble method with SHapley Additive exPlanations (SHAP) to enhance the prediction and interpretation of gully erosion susceptibility. Applied to Jefferson County, Illinois, our approach leverages Random Forest (RF), Gradient Boosting Machine (GBM), Logistic Regression (LR), and Deep Neural Networks (DNN) as both base and meta-learners in various configurations, resulting in 44 distinct stacking models. The comparative analysis demonstrated the superior predictive performance of the stacked models when evaluated at 200 randomly selected gully site points based on LiDAR difference observations; all but three exceeded the highest area under the curve (AUC) value of 0.86 achieved by the best-performing base model (GBM). The LR stacking model, combining RF and GBM as base models with LR as the meta-learner, emerged as the most effective, achieving an AUC of 0.916. The gully erosion susceptibility map produced by the LR stacking model classified 33 % of the agricultural land (89,208 ha) as the "very high" class, compared to 27 %, 87 %, 27 %, and 55 % predicted by the individual RF, LR, GBM, and DNN models, respectively. Crucially, SHAP analysis elucidated how changes in feature values influence model behavior, considering feature interactions within both the base models and the meta-learner. SHAP identified the annual leaf area index (LAI) as the most influential feature in both the RF and GBM base models. Additionally, it highlighted the significance of the GBM base model relative to the RF base model in the final decision-making process of the stacking model. By offering a transparent mechanism to evaluate how different features and models contribute to final decisions, this approach can be extended to broader environmental management and policy-making contexts, facilitating more informed and responsible resource allocation.
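
The best-performing configuration described in the abstract (RF and GBM as base models, LR as the meta-learner, scored by AUC) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: scikit-learn's `StackingClassifier` stands in for whatever tooling the authors used, the data are synthetic rather than the Jefferson County dataset, and the hyperparameters are arbitrary. Instead of calling the SHAP library, the sketch uses the closed-form SHAP value of a linear model under a feature-independence assumption, `coef * (x - mean(x))` in log-odds space, to show how the meta-learner's decision can be attributed to each base model.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for gully / non-gully samples (NOT the study's data).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Base learners (RF, GBM) feed their predicted probabilities to an LR
# meta-learner; hyperparameters are illustrative assumptions.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("gbm", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    stack_method="predict_proba",
    cv=5,  # out-of-fold base predictions are used to train the meta-learner
)
stack.fit(X_train, y_train)

# Predictive performance of the stacked model on held-out data.
auc = roc_auc_score(y_test, stack.predict_proba(X_test)[:, 1])

# Meta-features: one positive-class probability column per base model.
meta_background = stack.transform(X_train)
meta_test = stack.transform(X_test)

# Closed-form SHAP values for the linear meta-learner (log-odds space),
# assuming independent meta-features: phi_j = w_j * (x_j - E[x_j]).
weights = stack.final_estimator_.coef_[0]
background_mean = meta_background.mean(axis=0)
phi = weights * (meta_test - background_mean)
base_value = stack.final_estimator_.intercept_[0] + weights @ background_mean

# Mean absolute attribution per base model, a common global importance summary.
importance = dict(zip(["rf", "gbm"], np.abs(phi).mean(axis=0)))
print(f"AUC = {auc:.3f}")
print("Mean |SHAP| per base model:", importance)
```

Comparing the two entries of `importance` mirrors, in spirit, how the abstract weighs the GBM base model against the RF base model in the stacking model's final decision; the numbers produced by this sketch say nothing about the study's actual results.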


Pages: 13
