Consistent algorithms for multiclass classification with an abstain option

被引:41
作者
Ramaswamy, Harish G. [1 ]
Tewari, Ambuj [2 ,3 ]
Agarwal, Shivani [4 ]
机构
[1] Indian Inst Technol Madras, Dept Comp Sci & Engn, Madras, Tamil Nadu, India
[2] Univ Michigan, Dept Stat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
[4] Univ Penn, Dept Comp & Informat Sci, 200 S 33Rd St, Philadelphia, PA 19104 USA
关键词
Machine learning; classification; statistical consistency; surrogate loss; calibration; abstain loss; SUPPORT VECTOR MACHINES; REJECT OPTION; ERROR;
D O I
10.1214/17-EJS1388
中图分类号
O21 [概率论与数理统计]; C8 [统计学];
学科分类号
020208 ; 070103 ; 0714 ;
摘要
We consider the problem of n-class classification (n >= 2), where the classifier can choose to abstain from making predictions at a given cost, say, a factor alpha of the cost of misclassification. Our goal is to design consistent algorithms for such n-class classification problems with a 'reject option'; while such algorithms are known for the binary (n = 2) case, little has been understood for the general multiclass case. We show that the well known Crammer-Singer surrogate and the one-vs-all hinge loss, albeit with a different predictor than the standard argmax, yield consistent algorithms for this problem when alpha = 1/2. More interestingly, we design a new convex surrogate, which we call the binary encoded predictions surrogate, that is also consistent for this problem when alpha = 1/2 and operates on a much lower dimensional space (log(n) as opposed to n). We also construct modified versions of all these three surrogates to be consistent for any given alpha is an element of [0, 1/2].
引用
收藏
页码:530 / 554
页数:25
相关论文
共 31 条
[1]  
Allwein E., 2002, JMLR, V1, P113
[2]  
[Anonymous], 1999, Advances in kernel methods: Support vector learning
[3]  
[Anonymous], ADV NEURAL INFORM PR
[4]  
[Anonymous], 2001, Journal of Machine Learning Research
[5]  
Bartlett PL, 2008, J MACH LEARN RES, V9, P1823
[6]  
CHOW CK, 1970, IEEE T INFORM THEORY, V16, P41, DOI 10.1109/TIT.1970.1054406
[7]  
Cortes C., 2016, ALGORITHMIC LEARNING
[8]  
Duchi J., 2008, ICML 08, P272
[9]  
El-Yaniv R, 2010, J MACH LEARN RES, V11, P1605
[10]   Reject option with multiple thresholds [J].
Fumera, G ;
Roli, F ;
Giacinto, G .
PATTERN RECOGNITION, 2000, 33 (12) :2099-2101