Machine learning with data dependent hypothesis classes

Cited by: 29
Authors
Cannon, A [1]
Ettinger, JM
Hush, D
Scovel, C
Affiliations
[1] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
[2] Los Alamos Natl Lab, Nonproliferat & Int Secur Grp, NIS8, Los Alamos, NM 87545 USA
[3] Los Alamos Natl Lab, Modeling Algorithms & Informat Grp, CCS3, Los Alamos, NM 87545 USA
Keywords
computational learning theory; empirical process theory; classification; shatter coefficient; structural risk minimization;
DOI
10.1162/153244302760200650
Chinese Library Classification (CLC)
TP [automation technology; computer technology]
Discipline Classification Code
0812
Abstract
We extend the VC theory of statistical learning to data dependent spaces of classifiers. This theory can be viewed as a decomposition of classifier design into two components; the first component is a restriction to a data dependent hypothesis class and the second is empirical risk minimization within that class. We define a measure of complexity for data dependent hypothesis classes and provide data dependent versions of bounds on error deviance and estimation error. We also provide a structural risk minimization procedure over data dependent hierarchies and prove consistency. We use this theory to provide a framework for studying the trade-offs between performance and computational complexity in classifier design. As a consequence we obtain a new family of classifiers with dimension independent performance bounds and efficient learning procedures.
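The two-component decomposition described in the abstract can be illustrated with a short sketch. The example below is purely hypothetical and is not the construction used in the paper: it assumes a data dependent hypothesis class of balls centered at training points (component 1) and performs empirical risk minimization within that class (component 2). All function names, the choice of hypothesis class, and the toy data are illustrative assumptions.

```python
# Minimal sketch, assuming a hypothetical data dependent class of balls
# centered at training points; not the authors' construction.
import numpy as np

def build_data_dependent_class(X, radii):
    """Component 1: restrict to a hypothesis class built from the sample.
    Each hypothesis predicts +1 inside a ball centered at a training point."""
    return [(center.copy(), r) for center in X for r in radii]

def empirical_risk(h, X, y):
    """Fraction of training points misclassified by hypothesis h."""
    center, r = h
    preds = np.where(np.linalg.norm(X - center, axis=1) <= r, 1, -1)
    return np.mean(preds != y)

def erm_over_class(hypotheses, X, y):
    """Component 2: empirical risk minimization within the restricted class."""
    return min(hypotheses, key=lambda h: empirical_risk(h, X, y))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 2))
    y = np.where(np.linalg.norm(X, axis=1) <= 1.0, 1, -1)  # toy labels
    H_n = build_data_dependent_class(X, radii=[0.5, 1.0, 1.5])
    h_star = erm_over_class(H_n, X, y)
    print("chosen radius:", h_star[1],
          "empirical risk:", empirical_risk(h_star, X, y))
```

Because the hypothesis class in such a sketch is itself built from the training sample, classical VC bounds do not apply directly; the paper's data dependent complexity measure and the corresponding bounds on error deviance and estimation error are what make this style of analysis rigorous.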
Pages: 335-358
Page count: 24