Breakdown Point of Robust Support Vector Machines

Cited by: 8
Authors
Kanamori, Takafumi [1 ,4 ]
Fujiwara, Shuhei [2 ]
Takeda, Akiko [3 ,4 ]
Affiliations
[1] Nagoya Univ, Dept Comp Sci & Math Informat, Nagoya, Aichi 4648601, Japan
[2] TOPGATE Co Ltd, Bunkyo Ku, Tokyo 1130033, Japan
[3] Inst Stat Math, Tokyo 1908562, Japan
[4] RIKEN Ctr Adv Intelligence Project, Tokyo 1030027, Japan
Keywords
support vector machine; breakdown point; outlier; kernel function; RISK; CLASSIFICATION; MARGIN;
DOI
10.3390/e19020083
Chinese Library Classification
O4 [Physics];
Subject Classification
0702;
Abstract
The support vector machine (SVM) is one of the most successful learning methods for classification problems. Despite its popularity, the SVM has a serious drawback: it is sensitive to outliers in the training samples. The penalty on misclassification is given by a convex loss called the hinge loss, and it is the unboundedness of this loss that makes the method sensitive to outliers. To deal with outliers, robust SVMs have been proposed that replace the hinge loss with a bounded, non-convex loss called the ramp loss. In this paper, we study the breakdown point of robust SVMs. The breakdown point is a robustness measure: the largest fraction of contamination under which the estimated classifier still conveys information about the non-contaminated data. The main contribution of this paper is an exact evaluation of the breakdown point of robust SVMs. For learning parameters such as the regularization parameter, we derive a simple formula that guarantees the robustness of the classifier. When the learning parameters are determined by a grid search with cross-validation, our formula reduces the number of candidate search points. Furthermore, the theoretical findings are confirmed in numerical experiments. We show that the statistical properties of robust SVMs are well explained by a theoretical analysis of the breakdown point.
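For readers unfamiliar with the two losses contrasted in the abstract, a standard parameterization (assumed here; the paper itself may use a tunable truncation level) is

    hinge(z) = max(0, 1 - z),    ramp(z) = min(1, max(0, 1 - z)),

where z = y f(x) is the margin of the classifier f on a labeled sample (x, y) with y in {-1, +1}. The hinge loss grows without bound as z decreases, so a single gross outlier can dominate the empirical risk; the ramp loss caps each sample's penalty at 1, which is the boundedness property the abstract refers to.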
Pages: 27