Effect of Training Class Label Noise on Classification Performances for Land Cover Mapping with Satellite Image Time Series

Cited by: 142
Authors
Pelletier, Charlotte [1]
Valero, Silvia [1]
Inglada, Jordi [1]
Champion, Nicolas [2]
Sicre, Claire Marais [1]
Dedieu, Gerard [1]
Affiliations
[1] Univ Toulouse, CESBIO, UMR 5126, CNES,CNRS,IRD,UPS, 18 Ave Edouard Belin, F-31401 Toulouse 9, France
[2] Univ Paris Est Marne la Vallee, IGN Espace, LASTIG, MATIS, 73 Ave Paris, F-94160 St Mande, France
Keywords
class label noise; mislabeled training data; satellite image time series; classification; land cover mapping; Support Vector Machines; Random Forests; RANDOM FOREST CLASSIFIER; LEARNING ALGORITHMS; RESOLUTION; MAP; CROP; PHENOLOGY; ACCURACY;
DOI
10.3390/rs9020173
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Supervised classification systems used for land cover mapping require accurate reference databases. These reference data generally come from different sources such as field measurements, thematic maps, or aerial photographs. Due to misregistration, update delays, or land cover complexity, they may contain class label noise, i.e., wrong label assignments. This study evaluates the impact of mislabeled training data on classification performance for land cover mapping. In particular, it addresses random and systematic label noise in the classification of high-resolution satellite image time series. Experiments are carried out on synthetic and real datasets with two traditional classifiers: Support Vector Machines (SVM) and Random Forests (RF). A synthetic dataset simulating vegetation profiles over one year was designed for this study. The real dataset is composed of Landsat-8 and SPOT-4 images acquired over one year in the south of France. The results show that both classifiers are little affected by low random noise levels, up to 25%-30%, but their performance drops at higher noise levels. Different classification configurations are tested by increasing the number of classes, using different input feature vectors, and changing the number of training instances. Algorithm complexities are also analyzed. The RF classifier is highly robust to random and systematic label noise in all the tested configurations, whereas the SVM classifier is more sensitive to the kernel choice and to the input feature vectors. Finally, this work reveals that the cross-validation procedure is affected by the presence of class label noise.
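As a rough illustration of the two noise types named in the abstract, the sketch below corrupts a list of training labels in two ways. It is not the authors' protocol: the abstract does not specify how noise was injected, so the function names, the class names, the uniform choice of a replacement class for random noise, and the single source-to-target confusion used for systematic noise are all assumptions made here for illustration.

```python
import random


def inject_random_label_noise(labels, noise_rate, classes, seed=0):
    """Assumed model of random noise: flip a fraction `noise_rate` of all
    labels, each to a different class drawn uniformly at random."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(noise_rate * len(noisy))
    for i in rng.sample(range(len(noisy)), n_flip):
        alternatives = [c for c in classes if c != noisy[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


def inject_systematic_label_noise(labels, noise_rate, source, target, seed=0):
    """Assumed model of systematic noise: relabel a fraction `noise_rate`
    of the `source` class instances as one specific `target` class."""
    rng = random.Random(seed)
    noisy = list(labels)
    source_idx = [i for i, y in enumerate(noisy) if y == source]
    n_flip = round(noise_rate * len(source_idx))
    for i in rng.sample(source_idx, n_flip):
        noisy[i] = target
    return noisy


# Hypothetical class names and sizes, chosen only for the example.
classes = ["wheat", "maize", "forest"]
labels = ["wheat"] * 50 + ["maize"] * 50 + ["forest"] * 50

random_noisy = inject_random_label_noise(labels, 0.2, classes)
systematic_noisy = inject_systematic_label_noise(labels, 0.2, "wheat", "maize")
```

A noise-robustness experiment in this spirit would train a classifier on the corrupted labels at increasing `noise_rate` values and score it against clean test labels.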
Pages: 24