Machine learning with data dependent hypothesis classes

被引:29
作者
Cannon, A [1 ]
Ettinger, JM
Hush, D
Scovel, C
机构
[1] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
[2] Los Alamos Natl Lab, Nonproliferat & Int Secur Grp, NIS8, Los Alamos, NM 87545 USA
[3] Los Alamos Natl Lab, Modeling Algorithms & Informat Grp, CCS3, Los Alamos, NM 87545 USA
关键词
computational learning theory; empirical process theory; classification; shatter coefficient; structural risk minimization;
D O I
10.1162/153244302760200650
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
We extend the VC theory of statistical learning to data dependent spaces of classifiers. This theory can be viewed as a decomposition of classifier design into two components; the first component is a restriction to a data dependent hypothesis class and the second is empirical risk minimization within that class. We define a measure of complexity for data dependent hypothesis classes and provide data dependent versions of bounds on error deviance and estimation error. We also provide a structural risk minimization procedure over data dependent hierarchies and prove consistency. We use this theory to provide a framework for studying the trade-offs between performance and computational complexity in classifier design. As a consequence we obtain a new family of classifiers with dimension independent performance bounds and efficient learning procedures.
引用
收藏
页码:335 / 358
页数:24
相关论文
共 31 条
[1]  
AMBARTZUMIAN RV, 1990, FACTORIZATION CALCUL
[2]  
[Anonymous], COLT
[3]  
[Anonymous], 1997, APPROXIMATION ALGORI
[4]  
Ash R. B., 1972, REAL ANAL PROBABILIT, DOI DOI 10.1016/C2013-0-06164-6
[5]  
BARTLETT P, 2000, MODEL SELECTION ERRO
[6]  
Boucheron S, 2000, RANDOM STRUCT ALGOR, V16, P277, DOI 10.1002/(SICI)1098-2418(200005)16:3<277::AID-RSA4>3.0.CO
[7]  
2-1
[8]   Learning by canonical smooth estimation .2. Learning and choice of model complexity [J].
Buescher, KL ;
Kumar, PR .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1996, 41 (04) :557-569
[9]  
CANNON AH, 2000, P 6 INT S MATH ART I
[10]  
CORTES C, 1995, MACH LEARN, V20, P273, DOI 10.1023/A:1022627411411