Maximum Relative Margin and Data-Dependent Regularization

Cited by: 0
Authors
Shivaswamy, Pannagadatta K. [1]
Jebara, Tony [1]
Affiliations
[1] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
Funding
National Science Foundation (USA)
Keywords
support vector machines; kernel methods; large margin; Rademacher complexity
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Leading classification methods such as support vector machines (SVMs) and their counterparts achieve strong generalization performance by maximizing the margin of separation between data classes. While the maximum margin approach has achieved promising performance, this article identifies its sensitivity to affine transformations of the data and to directions with large data spread. Maximum margin solutions may be misled by the spread of data and preferentially separate classes along large spread directions. This article corrects these weaknesses by measuring margin not in the absolute sense but rather only relative to the spread of data in any projection direction. Maximum relative margin corresponds to a data-dependent regularization on the classification function, while maximum absolute margin corresponds to an ℓ2 norm constraint on the classification function. Interestingly, the proposed improvements only require simple extensions to existing maximum margin formulations and preserve the computational efficiency of SVMs. Through the maximization of relative margin, surprising performance gains are achieved on real-world problems such as digit and text classification, and on several other benchmark data sets. In addition, risk bounds are derived for the new formulation based on Rademacher averages.
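The contrast between absolute and relative margin described in the abstract can be illustrated with a small toy computation. The sketch below is not the authors' formulation or algorithm: the function name `margin_stats`, the toy data set, and the choice of measuring spread as the largest absolute projected score are all assumptions made for illustration. It only compares two fixed projection directions and does not solve any optimization problem.

```python
import math

# Hypothetical toy data: both classes are separable along either axis,
# but the first coordinate has a much larger spread than the second.
POINTS = [(3.0, 0.5), (20.0, 0.6), (40.0, 0.5),
          (-3.0, -0.5), (-20.0, -0.6), (-40.0, -0.5)]
LABELS = [1, 1, 1, -1, -1, -1]


def margin_stats(w, points, labels):
    """Return (absolute margin, spread, relative margin) for direction w.

    Assumes the positive class projects to larger values than the
    negative class along w (true for the two directions used below).
    """
    norm = math.hypot(w[0], w[1])
    unit = (w[0] / norm, w[1] / norm)
    proj = [unit[0] * p[0] + unit[1] * p[1] for p in points]
    pos = [v for v, t in zip(proj, labels) if t > 0]
    neg = [v for v, t in zip(proj, labels) if t < 0]
    # Place the threshold midway between the closest opposite-class projections.
    b = -(min(pos) + max(neg)) / 2.0
    scores = [v + b for v in proj]
    margin = min(t * s for s, t in zip(scores, labels))
    # Spread is taken here as the largest absolute score (an assumption).
    spread = max(abs(s) for s in scores)
    return margin, spread, margin / spread


margin_x, spread_x, relative_x = margin_stats((1.0, 0.0), POINTS, LABELS)
margin_y, spread_y, relative_y = margin_stats((0.0, 1.0), POINTS, LABELS)
print("large-spread direction:", margin_x, spread_x, relative_x)
print("small-spread direction:", margin_y, spread_y, relative_y)
```

On this toy data the large-spread direction has the larger absolute margin, while the small-spread direction has the larger margin relative to spread, which is the kind of preference reversal the abstract attributes to measuring margin relative to data spread.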
Pages: 747-788
Page count: 42