Consistent algorithms for multiclass classification with an abstain option

被引：41

作者：

Ramaswamy, Harish G. ^{[1
]}

Tewari, Ambuj ^{[2
,3
]}

Agarwal, Shivani ^{[4
]}

机构：

[1] Indian Inst Technol Madras, Dept Comp Sci & Engn, Madras, Tamil Nadu, India

[2] Univ Michigan, Dept Stat, Ann Arbor, MI 48109 USA

[3] Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA

[4] Univ Penn, Dept Comp & Informat Sci, 200 S 33Rd St, Philadelphia, PA 19104 USA

来源：

ELECTRONIC JOURNAL OF STATISTICS | 2018年 / 12卷 / 01期

关键词：

Machine learning; classification; statistical consistency; surrogate loss; calibration; abstain loss; SUPPORT VECTOR MACHINES; REJECT OPTION; ERROR;

D O I：

10.1214/17-EJS1388

中图分类号：

O21 [概率论与数理统计]; C8 [统计学];

学科分类号：

020208 ; 070103 ; 0714 ;

摘要：

We consider the problem of n-class classification (n >= 2), where the classifier can choose to abstain from making predictions at a given cost, say, a factor alpha of the cost of misclassification. Our goal is to design consistent algorithms for such n-class classification problems with a 'reject option'; while such algorithms are known for the binary (n = 2) case, little has been understood for the general multiclass case. We show that the well known Crammer-Singer surrogate and the one-vs-all hinge loss, albeit with a different predictor than the standard argmax, yield consistent algorithms for this problem when alpha = 1/2. More interestingly, we design a new convex surrogate, which we call the binary encoded predictions surrogate, that is also consistent for this problem when alpha = 1/2 and operates on a much lower dimensional space (log(n) as opposed to n). We also construct modified versions of all these three surrogates to be consistent for any given alpha is an element of [0, 1/2].

引用

页码：530 / 554

页数：25

共 31 条

[1]

Allwein E., 2002, JMLR, V1, P113

[2]

[Anonymous], 1999, Advances in kernel methods: Support vector learning

[3]

[Anonymous], ADV NEURAL INFORM PR

[4]

[Anonymous], 2001, Journal of Machine Learning Research

[5]

Bartlett PL, 2008, J MACH LEARN RES, V9, P1823

[6]

CHOW CK, 1970, IEEE T INFORM THEORY, V16, P41, DOI 10.1109/TIT.1970.1054406

[7]

Cortes C., 2016, ALGORITHMIC LEARNING

[8]

Duchi J., 2008, ICML 08, P272

[9]

El-Yaniv R, 2010, J MACH LEARN RES, V11, P1605

[10] Reject option with multiple thresholds [J].

Fumera, G ;

Roli, F ;

Giacinto, G .

PATTERN RECOGNITION, 2000, 33 (12) :2099-2101

← 1 2 3 4 →