Data representations and generalization error in kernel based learning machines

Cited by: 21
Authors
Ancona, Nicola [1]
Maglietta, Rosalia [1]
Stella, Ettore [1]
Affiliations
[1] CNR, Ist Studi Sistemi Intelligenti Automaz, I-70126 Bari, Italy
Funding
National Science Foundation (USA);
Keywords
supervised learning; classification; support vector machines; generalization; leave-one-out error; sparse and dense data representation;
DOI
10.1016/j.patcog.2005.11.025
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper focuses on the problem of how data representation influences the generalization error of kernel based learning machines like support vector machines (SVM) for classification. Frame theory provides a well founded mathematical framework for representing data in many different ways. We analyze the effects of sparse and dense data representations on the generalization error of such learning machines measured by using leave-one-out error given a finite amount of training data. We show that, in the case of sparse data representations, the generalization error of an SVM trained by using polynomial or Gaussian kernel functions is equal to the one of a linear SVM. This is equivalent to saying that the capacity of separating points of functions belonging to hypothesis spaces induced by polynomial or Gaussian kernel functions reduces to the capacity of a separating hyperplane in the input space. Moreover, we show that, in general, sparse data representations increase or leave unchanged the generalization error of kernel based methods. Dense data representations, on the contrary, reduce the generalization error in the case of very large frames. We use two different schemes for representing data in overcomplete systems of Haar and Gabor functions, and measure SVM generalization error on benchmarked data sets. (c) 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
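As a rough illustration of the leave-one-out error that the abstract uses as its measure of generalization, the sketch below holds out each training example in turn, trains on the rest, and counts the held-out mistakes for a linear and a Gaussian kernel. It is not the authors' method: a simple kernel perceptron stands in for the SVM so that the example needs only the standard library, and the toy data, the function names, and the parameters `sigma` and `epochs` are all assumptions made for illustration.

```python
import math


def linear_kernel(x, z):
    """Inner product in the input space."""
    return sum(a * b for a, b in zip(x, z))


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel; sigma is an assumed width."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision_value(X, y, alpha, kernel, x):
    """Dual-form score; the +1.0 adds a bias term via the kernel."""
    return sum(a * yj * (kernel(xj, x) + 1.0)
               for a, yj, xj in zip(alpha, y, X))


def train_kernel_perceptron(X, y, kernel, epochs=20):
    """Stand-in learner (NOT an SVM): alpha[i] counts mistakes on example i."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, xi in enumerate(X):
            if y[i] * decision_value(X, y, alpha, kernel, xi) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # training set is separated, stop early
            break
    return alpha


def leave_one_out_error(X, y, kernel):
    """Fraction of examples misclassified when each is held out in turn."""
    errors = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        alpha = train_kernel_perceptron(X_train, y_train, kernel)
        score = decision_value(X_train, y_train, alpha, kernel, X[i])
        predicted = 1 if score > 0 else -1
        if predicted != y[i]:
            errors += 1
    return errors / len(X)


# Hypothetical toy data: two well-separated clusters with labels -1 / +1.
X_toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
         (3.0, 3.0), (3.0, 4.0), (4.0, 3.0)]
y_toy = [-1, -1, -1, 1, 1, 1]

if __name__ == "__main__":
    print("linear LOO error:  ", leave_one_out_error(X_toy, y_toy, linear_kernel))
    print("gaussian LOO error:", leave_one_out_error(X_toy, y_toy, gaussian_kernel))
```

Comparing the two printed values on the same data, once per data representation, mirrors the kind of comparison the abstract describes; with a real SVM implementation only `train_kernel_perceptron` and `decision_value` would need to be swapped out.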
Pages: 1588-1603
Page count: 16