Classification Efficacy Using K-Fold Cross-Validation and Bootstrapping Resampling Techniques on the Example of Mapping Complex Gully Systems

Cited by: 29
Authors
Phinzi, Kwanele [1]
Abriha, David [1]
Szabo, Szilard [2]
Affiliations
[1] Univ Debrecen, Fac Sci & Technol, Doctoral Sch Earth Sci, Dept Phys Geog & Geoinformat, Egyet Ter 1, H-4032 Debrecen, Hungary
[2] Univ Debrecen, Fac Sci & Technol, Dept Phys Geog & Geoinformat, Egyet Ter 1, H-4032 Debrecen, Hungary
Keywords
satellite imagery; gully mapping; machine learning; random forest; support vector machines; South Africa; semi-arid environment; EASTERN CAPE; SOIL-EROSION; SOUTH-AFRICA; IMAGE CLASSIFICATION; SPATIAL-DISTRIBUTION; RANDOM FOREST; DATA FUSION; ACCURACY; MACHINE; REGION
DOI
10.3390/rs13152980
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The availability of aerial and satellite imagery has greatly reduced the costs and time associated with gully mapping, especially in remote locations. Nevertheless, accurate identification of gullies from satellite images remains an open issue despite the amount of literature addressing this problem. The main objective of this work was to investigate the performance of support vector machines (SVM) and random forest (RF) algorithms in extracting gullies based on two resampling methods: bootstrapping and k-fold cross-validation (CV). To achieve this objective, we used PlanetScope data acquired during the wet and dry seasons. Using the Normalized Difference Vegetation Index (NDVI) and multispectral bands, we also explored the potential of the PlanetScope image in discriminating gullies from the surrounding land cover. Results revealed that gullies had significantly different (p < 0.001) spectral profiles from any other land cover class in all bands of the PlanetScope image, in both the wet and dry seasons. However, NDVI was not efficient in gully discrimination. Based on the overall accuracies, RF's performance was better with CV, particularly in the dry season, where its performance was up to 4% better than the SVM's. Nevertheless, class-level metrics (omission error: 11.8%; commission error: 19%) showed that SVM combined with CV was more successful in gully extraction in the wet season. In contrast, RF combined with bootstrapping had relatively low omission (16.4%) and commission errors (10.4%), making it the most efficient algorithm in the dry season. The estimated gully area was 88 ± 14.4 ha in the dry season and 57.2 ± 18.8 ha in the wet season. Based on the standard error (8.2 ha), the wet season was more appropriate for gully identification than the dry season, which had a slightly higher standard error (8.6 ha).
For the first time, this study sheds light on the influence of these resampling techniques on the accuracy of satellite-based gully mapping. More importantly, it provides the basis for further investigations into the accuracy of such resampling techniques, especially when using satellite images other than PlanetScope data.
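
For readers unfamiliar with the two resampling schemes and the class-level error metrics named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' workflow: the abstract does not state the number of folds, the number of bootstrap rounds, or the software used, so the function names (`k_fold_indices`, `bootstrap_indices`, `class_errors`), the parameter values, and the toy labels are all hypothetical. The sketch assumes the usual remote-sensing definitions, where omission error is the share of reference pixels of a class that the map missed and commission error is the share of pixels mapped as that class that are wrong.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return k (train, test) index pairs; the test folds are disjoint
    and together cover every sample exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        splits.append((train, test))
    return splits


def bootstrap_indices(n_samples, n_rounds, seed=0):
    """Return n_rounds (train, test) pairs; each training set is drawn
    with replacement, and the test set is the out-of-bag remainder."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_rounds):
        train = [rng.randrange(n_samples) for _ in range(n_samples)]
        drawn = set(train)
        test = [j for j in range(n_samples) if j not in drawn]
        splits.append((train, test))
    return splits


def class_errors(reference, predicted, target):
    """Omission and commission error (as fractions) for one class.
    Assumes the class occurs at least once in both label lists."""
    pairs = list(zip(reference, predicted))
    tp = sum(r == target and p == target for r, p in pairs)
    fn = sum(r == target and p != target for r, p in pairs)
    fp = sum(r != target and p == target for r, p in pairs)
    omission = fn / (tp + fn)    # reference pixels of the class that were missed
    commission = fp / (tp + fp)  # mapped pixels of the class that are wrong
    return omission, commission


# Toy labels (hypothetical, not data from the study).
reference = ["gully"] * 4 + ["other"] * 6
predicted = ["gully", "gully", "gully", "other",
             "gully", "other", "other", "other", "other", "other"]
omission, commission = class_errors(reference, predicted, "gully")
```

In practice a classifier such as RF or SVM would be fitted on each `train` index set and scored on the matching `test` set. The key difference is that k-fold CV uses every sample for testing exactly once, whereas bootstrapping trains on a sample drawn with replacement and tests on whatever was left out.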
Pages: 18