Comparison of Variable Selection Methods among Dominant Tree Species in Different Regions on Forest Stock Volume Estimation

Cited by: 11
Authors
Fang, Gengsheng [1, 2]
Fang, Luming [1, 2]
Yang, Laibang [3]
Wu, Dasheng [1, 2]
Affiliations
[1] Zhejiang A&F Univ, Key Lab Forestry Intelligent Monitoring & Informa, Hangzhou 311300, Peoples R China
[2] Zhejiang A&F Univ, Coll Math & Comp Sci, Hangzhou 311300, Peoples R China
[3] Hangzhou Ganzhi Technol Co Ltd, Hangzhou 310000, Peoples R China
Keywords
FSV; variable selection; dominant tree species; XGBoost; RFE; PI; MDI; STEM VOLUME; LANDSAT-TM; SENTINEL-2; LIDAR; AREA; IMAGERY;
DOI
10.3390/f13050787
Chinese Library Classification number
S7 [Forestry]
Discipline classification codes
0829; 0907
Abstract
Forest stock volume (FSV) is a crucial indicator of the quality of forest resources. Variable selection methods are commonly used when building FSV estimation models. However, few studies have explored which variable selection methods yield a selected data set with better explanatory power and robustness for the same dominant tree species in different regions. In this study, we chose six dominant tree species groups from Lin'an District, Anji County, and a part of Longquan City: broad-leaved, coniferous, Masson pine, Chinese fir, coniferous and broad-leaved mixed forest (denoted "mixed"), and all tree species, which combines the preceding five groups (denoted "all"). Variables derived from satellite images, terrain factors, and forest inventory data were then selected by six variable selection methods (least absolute shrinkage and selection operator (LASSO), recursive feature elimination (RFE), stepwise regression (Step-Reg), permutation importance (PI), mean decrease impurity (MDI), and SelectFromModel based on LightGBM (SFM)), separately for each dominant tree type in each region. The selected variables formed new datasets, one per dominant tree type. Extreme gradient boosting (XGBoost) was then combined with each variable selection method to estimate the FSV. The results are as follows. For coniferous, RFE performed better both on average and in the separate regions. For Chinese fir and all, PI performed better both on average and in the separate regions. For Masson pine, MDI performed better both on average and in the separate regions. For mixed, MDI performed better on average, while RFE performed better overall in the separate regions.
Considering both the separate regions and the average result, RFE, MDI, and PI all performed well in selecting variables for FSV estimation. Furthermore, we selected the five factors with the highest feature importance for each tree type, and the results showed that tree age and canopy density were both of great importance to the estimation of FSV. Compared with no variable selection, variable selection also improved the performance of the model. Additionally, the results for different tree types in different regions suggest that a small scale and the diversity of dominant tree types may lead to unstable and unreliable experimental results. The study provides some insight into applying the optimal variable selection methods to the same dominant tree type in different regions, and it will help the development of variable selection methods for FSV estimation.
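To make the permutation importance (PI) step named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain linear model fitted by gradient descent in place of XGBoost so that it runs with the Python standard library only. The data are synthetic, and the feature names, coefficients, and the 5% selection threshold are hypothetical choices for illustration.

```python
import random


def predict(w, b, row):
    """Linear prediction for one sample."""
    return sum(wj * xj for wj, xj in zip(w, row)) + b


def fit_linear(X, y, lr=0.05, epochs=500):
    """Fit y ~ X.w + b by batch gradient descent on mean squared error.
    Stand-in for the XGBoost regressor used in the paper (assumption)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        errs = [predict(w, b, row) - yi for row, yi in zip(X, y)]
        for j in range(p):
            w[j] -= lr * 2.0 / n * sum(e * row[j] for e, row in zip(errs, X))
        b -= lr * 2.0 / n * sum(errs)
    return w, b


def mse(w, b, X, y):
    """Mean squared error of the fitted model on (X, y)."""
    return sum((predict(w, b, row) - yi) ** 2 for row, yi in zip(X, y)) / len(X)


def permutation_importance(w, b, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when a single feature column is shuffled."""
    rng = random.Random(seed)
    base = mse(w, b, X, y)
    scores = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            total += mse(w, b, X_perm, y) - base
        scores.append(total / n_repeats)
    return scores


# Synthetic data: hypothetical feature names, not the paper's variables or values.
rng = random.Random(42)
names = ["tree_age", "canopy_density", "noise_band"]
X = [[rng.gauss(0, 1) for _ in names] for _ in range(200)]
y = [3.0 * r[0] + 2.0 * r[1] + rng.gauss(0, 0.1) for r in X]

# Fit on the first 150 samples, score importance on the held-out 50.
w, b = fit_linear(X[:150], y[:150])
scores = permutation_importance(w, b, X[150:], y[150:])

# Rank features and keep those above 5% of the top score (arbitrary threshold).
ranking = sorted(zip(names, scores), key=lambda t: t[1], reverse=True)
selected = [name for name, s in ranking if s > 0.05 * max(scores)]
print(ranking)
print(selected)
```

Scoring on held-out samples rather than the training set keeps the importance estimate from rewarding features the model has merely memorized. With a tree ensemble such as XGBoost the same loop applies unchanged, since PI only needs a prediction function and an error metric.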
Pages: 22