Covering number bounds of certain regularized linear function classes

被引:141
作者
Zhang, T [1 ]
机构
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
关键词
covering numbers; learning sample complexity; sparse approximation; mistake bounds;
D O I
10.1162/153244302760200713
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Recently, sample complexity bounds have been derived for problems involving linear functions such as neural networks and support vector machines. In many of these theoretical studies, the concept of covering numbers played an important role. It is thus useful to study covering numbers for linear function classes. In this paper, we investigate two closely related methods to derive upper bounds on these covering numbers. The first method, already employed in some earlier studies, relies on the so-called Maurey's lemma; the second method uses techniques from the mistake bound framework in online learning. We compare results from these two methods, as well as their consequences in some learning formulations.
引用
收藏
页码:527 / 550
页数:24
相关论文
共 38 条
[1]   Scale-sensitive dimensions, uniform convergence, and learnability [J].
Alon, N ;
BenDavid, S ;
CesaBianchi, N ;
Haussler, D .
JOURNAL OF THE ACM, 1997, 44 (04) :615-631
[2]  
[Anonymous], 1984, LECT NOTES MATH
[3]  
Anthony M., 1999, Neural network learning: theoretical foundations, Vfirst
[4]   UNIVERSAL APPROXIMATION BOUNDS FOR SUPERPOSITIONS OF A SIGMOIDAL FUNCTION [J].
BARRON, AR .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1993, 39 (03) :930-945
[5]  
Bartlett P, 1999, ADVANCES IN KERNEL METHODS, P43
[6]   The sample complexity of pattern classification with neural networks: The size of the weights is more important than the size of the network [J].
Bartlett, PL .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1998, 44 (02) :525-536
[7]  
Cristianini N, 2000, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods
[8]  
Devroye L., 1996, A probabilistic theory of pattern recognition
[9]   A decision-theoretic generalization of on-line learning and an application to boosting [J].
Freund, Y ;
Schapire, RE .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1997, 55 (01) :119-139
[10]  
GENTILE C, 1998, P NIPS 98