Multiclass classification with bandit feedback using adaptive regularization

被引：41

作者：

Crammer, Koby ^{[1
]}

Gentile, Claudio ^{[2
]}

机构：

[1] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel

[2] Univ Insubria, DICOM, I-21100 Varese, Italy

来源：

MACHINE LEARNING | 2013年 / 90卷 / 03期

关键词：

Online learning; Upper confidence bound; Regret; PERCEPTRON;

D O I：

10.1007/s10994-012-5321-8

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

We present a new multiclass algorithm in the bandit framework, where after making a prediction, the learning algorithm receives only partial feedback, i.e., a single bit indicating whether the predicted label is correct or not, rather than the true label. Our algorithm is based on the second-order Perceptron, and uses upper-confidence bounds to trade-off exploration and exploitation, instead of random sampling as performed by most current algorithms. We analyze this algorithm in a partial adversarial setting, where instances are chosen adversarially, while the labels are chosen according to a linear probabilistic model which is also chosen adversarially. We show a regret of , which improves over the current best bounds of in the fully adversarial setting. We evaluate our algorithm on nine real-world text classification problems and on four vowel recognition tasks, often obtaining state-of-the-art results, even compared with non-bandit online algorithms, especially when label noise is introduced.

引用

页码：347 / 383

页数：37

共 31 条

[1]

Auer P., 2003, Journal of Machine Learning Research, V3, P397, DOI 10.1162/153244303321897663

[2] Relative loss bounds for on-line density estimation with the exponential family of distributions [J].

Azoury, KS ;

Warmuth, MK .

MACHINE LEARNING, 2001, 43 (03) :211-246

[3]

Blitzer J., 2007, BIOGRAPHICS BOLLYWOO

[4] Second-order perceptron algorithm [J].

Cesa-Bianchi, N ;

Conconi, A ;

Gentile, C .

SIAM JOURNAL ON COMPUTING, 2005, 34 (03) :640-668

[5]

Cesa-Bianchi N., 2009, P 26 ICML

[6] Ultraconservative online algorithms for multiclass problems [J].

Crammer, K ;

Singer, Y .

JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) :951-991

[7] On the learnability and design of output codes for multiclass problems [J].

Crammer, K ;

Singer, Y .

MACHINE LEARNING, 2002, 47 (2-3) :201-233

[8]

Crammer K., 2009, NIPS 2009

[9]

Crammer K., 2009, EMNLP 2009

[10]

Dani V., 2008, COLT 2008

← 1 2 3 4 →