Training set size requirements for the classification of a specific class

Cited by: 261
Authors
Foody, Giles M. [1]
Mathur, Ajay
Sanchez-Hernandez, Carolina
Boyd, Doreen S.
Affiliations
[1] Univ Southampton, Sch Geog, Southampton SO17 1BJ, Hants, England
[2] Punjab Remote Sensing Ctr, Ludhiana 141004, Punjab, India
[3] Ordnance Survey, Res & Innovat, Southampton SO16 4GU, Hants, England
[4] Bournemouth Univ, Sch Conservat Sci, Poole BH12 5BB, Dorset, England
Keywords
classification; training set; support vector machine (SVM); support vector data description (SVDD)
DOI
10.1016/j.rse.2006.03.004
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The design of the training stage of a supervised classification should account for the properties of the classifier to be used. Consideration of the way the classifier operates may enable the training stage to be designed in a manner which ensures that the aim of the classification is satisfied with the use of a small, inexpensive training set. It may, therefore, be possible to reduce the training set size requirements from that generally expected with the use of standard heuristics. Substantial reductions in training set size may be possible if interest is focused on a single class. This is illustrated for mapping cotton in north-western India by support vector machine type classifiers. Four approaches to reducing training set size were used: intelligent selection of the most informative training samples, selective class exclusion, acceptance of imprecise descriptions for spectrally distinct classes, and the adoption of a one-class classifier. All four approaches were able to reduce the training set size required considerably below that suggested by conventional, widely used heuristics without significant impact on the accuracy with which the class of interest was classified. For example, reductions in training set size of approximately 90% from that suggested by a conventional heuristic are reported, with the accuracy of cotton classification remaining nearly constant at approximately 95% and 97% from the user's and producer's perspectives respectively. © 2006 Elsevier Inc. All rights reserved.
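The abstract reports accuracy for the class of interest from the user's and producer's perspectives. As a minimal sketch of how these two per-class measures are commonly computed from a confusion matrix (user's accuracy is the share of samples labelled as the class that truly belong to it; producer's accuracy is the share of reference samples of the class that were labelled correctly), the following may help. The function name `class_accuracies` and the counts are hypothetical, chosen only for illustration; they are not data from the paper.

```python
def class_accuracies(confusion, class_index):
    """Return (user's accuracy, producer's accuracy) for one class.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted (mapped) class is j.
    """
    correct = confusion[class_index][class_index]
    # All samples that truly belong to the class (row total).
    reference_total = sum(confusion[class_index])
    # All samples the classifier labelled as the class (column total).
    mapped_total = sum(row[class_index] for row in confusion)
    users = correct / mapped_total if mapped_total else float("nan")
    producers = correct / reference_total if reference_total else float("nan")
    return users, producers


# Hypothetical two-class matrix (class 0 = cotton, class 1 = everything else).
# These counts are illustrative assumptions, not results from the paper.
confusion = [
    [97, 3],    # reference cotton: 97 labelled cotton, 3 labelled other
    [5, 195],   # reference other: 5 labelled cotton, 195 labelled other
]

users, producers = class_accuracies(confusion, 0)
print(f"user's accuracy: {users:.3f}, producer's accuracy: {producers:.3f}")
# prints: user's accuracy: 0.951, producer's accuracy: 0.970
```

Reporting both measures matters when only one class is of interest, since a classifier can score well on one while over- or under-mapping the class on the other.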
Pages: 1-14
Number of pages: 14