Explainable machine learning-based fractional vegetation cover inversion and performance optimization - A case study of an alpine grassland on the Qinghai-Tibet Plateau

Cited by: 9
Authors
Li, Xinhong [1]
Chen, Jianjun [1,2]
Chen, Zizhen [1]
Lan, Yanping [1]
Ling, Ming [1]
Huang, Qinyi [1]
Li, Hucheng [1]
Han, Xiaowen [1,2]
Yi, Shuhua [3]
Affiliations
[1] Guilin Univ Technol, Coll Geomat & Geoinformat, Guilin 541004, Peoples R China
[2] Guilin Univ Technol, Guangxi Key Lab Spatial Informat & Geomat, Guilin 541004, Peoples R China
[3] Nantong Univ, Sch Geog Sci, Nantong 226007, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Fractional vegetation cover; Machine learning; Underlying surface heterogeneity; SHapley Additive exPlanations; Optuna; ABOVEGROUND BIOMASS; FEATURE-SELECTION; ALGORITHM; NDVI; LAI;
DOI
10.1016/j.ecoinf.2024.102768
Chinese Library Classification number
Q14 [Ecology (Bioecology)];
Discipline classification code
071012; 0713;
Abstract
Fractional Vegetation Cover (FVC) is a crucial indicator in ecological sustainability and climate change monitoring. Although machine learning is the primary method for FVC inversion, it still has shortcomings in feature selection, hyperparameter tuning, handling of underlying surface heterogeneity, and explainability. To address these challenges, this study drew on extensive FVC field data from the Qinghai-Tibet Plateau. First, a feature selection algorithm combining genetic algorithms and XGBoost was proposed. This algorithm was integrated with the Optuna tuning method to form the GA-OP combination, which optimizes feature selection and hyperparameter tuning in machine learning. Next, various machine learning models for FVC inversion in alpine grassland were compared, and the impact of underlying surface heterogeneity on inversion performance was investigated using the NDVI Coefficient of Variation (NDVI-CV). Lastly, the SHAP (SHapley Additive exPlanations) method was employed for both global and local interpretations of the optimal model. The results indicated that: (1) The GA-OP combination performed favorably in terms of computational cost and inversion accuracy, with Optuna demonstrating significant potential in hyperparameter tuning. (2) The Stacking model achieved the best performance in FVC inversion for alpine grassland among the seven models (R² = 0.867, RMSE = 0.12, RPD = 2.552, BIAS = -0.0005, VAR = 0.014), with the performance ranking as follows: Stacking > CatBoost > XGBoost > LightGBM > RFR > KNN > SVR. (3) NDVI-CV enhanced inversion performance and result reliability by excluding data from highly heterogeneous regions, which tended to be either overestimated or underestimated. (4) SHAP revealed the decision-making processes of the Stacking and CatBoost models from both global and local perspectives, allowing a deeper exploration of the causality between features and targets. This study developed a high-precision FVC inversion scheme and achieved accurate FVC inversion on the Qinghai-Tibet Plateau. The proposed approach provides a valuable reference for other ecological parameter inversions.
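The abstract names two ideas that are easy to illustrate: screening samples by the NDVI coefficient of variation, and scoring an inversion with R², RMSE, RPD and BIAS. The sketch below is a minimal illustration using only the Python standard library, not the authors' implementation. Everything specific in it is an assumption: the record does not give the paper's formulas, so the CV is taken as population standard deviation over the mean, RPD as the sample standard deviation of the observations over RMSE, and BIAS as the mean of predicted minus observed. The `ndvi_window` field, the `max_cv` threshold and the function names are hypothetical.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the absolute mean (assumed definition)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.pstdev(values) / abs(mean)


def filter_by_ndvi_cv(samples, max_cv):
    """Keep samples whose NDVI-CV is at most max_cv (hypothetical threshold)."""
    return [s for s in samples
            if coefficient_of_variation(s["ndvi_window"]) <= max_cv]


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted FVC."""
    errors = [p - o for o, p in zip(observed, predicted)]
    return math.sqrt(statistics.fmean([e * e for e in errors]))


def bias(observed, predicted):
    """Mean signed error, predicted minus observed (assumed sign convention)."""
    return statistics.fmean([p - o for o, p in zip(observed, predicted)])


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """Sample standard deviation of observations over RMSE (assumed definition)."""
    return statistics.stdev(observed) / rmse(observed, predicted)


if __name__ == "__main__":
    # Made-up demonstration values, not data from the study.
    plots = [
        {"id": "plot_a", "ndvi_window": [0.50, 0.52, 0.49, 0.51]},
        {"id": "plot_b", "ndvi_window": [0.10, 0.80, 0.30, 0.90]},
    ]
    kept = filter_by_ndvi_cv(plots, max_cv=0.3)
    print([p["id"] for p in kept])

    obs = [0.2, 0.4, 0.6, 0.8]
    pred = [0.25, 0.35, 0.65, 0.75]
    print(r_squared(obs, pred), rmse(obs, pred), rpd(obs, pred), bias(obs, pred))
```

In practice the NDVI values for each plot would come from the pixels surrounding a field sample, and the threshold would be chosen empirically; both choices are outside what this record states.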
Pages: 20