Multiclass classification with bandit feedback using adaptive regularization

被引:41
作者
Crammer, Koby [1 ]
Gentile, Claudio [2 ]
机构
[1] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
[2] Univ Insubria, DICOM, I-21100 Varese, Italy
关键词
Online learning; Upper confidence bound; Regret; PERCEPTRON;
D O I
10.1007/s10994-012-5321-8
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We present a new multiclass algorithm in the bandit framework, where after making a prediction, the learning algorithm receives only partial feedback, i.e., a single bit indicating whether the predicted label is correct or not, rather than the true label. Our algorithm is based on the second-order Perceptron, and uses upper-confidence bounds to trade-off exploration and exploitation, instead of random sampling as performed by most current algorithms. We analyze this algorithm in a partial adversarial setting, where instances are chosen adversarially, while the labels are chosen according to a linear probabilistic model which is also chosen adversarially. We show a regret of , which improves over the current best bounds of in the fully adversarial setting. We evaluate our algorithm on nine real-world text classification problems and on four vowel recognition tasks, often obtaining state-of-the-art results, even compared with non-bandit online algorithms, especially when label noise is introduced.
引用
收藏
页码:347 / 383
页数:37
相关论文
共 31 条
[1]  
Auer P., 2003, Journal of Machine Learning Research, V3, P397, DOI 10.1162/153244303321897663
[2]   Relative loss bounds for on-line density estimation with the exponential family of distributions [J].
Azoury, KS ;
Warmuth, MK .
MACHINE LEARNING, 2001, 43 (03) :211-246
[3]  
Blitzer J., 2007, BIOGRAPHICS BOLLYWOO
[4]   Second-order perceptron algorithm [J].
Cesa-Bianchi, N ;
Conconi, A ;
Gentile, C .
SIAM JOURNAL ON COMPUTING, 2005, 34 (03) :640-668
[5]  
Cesa-Bianchi N., 2009, P 26 ICML
[6]   Ultraconservative online algorithms for multiclass problems [J].
Crammer, K ;
Singer, Y .
JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) :951-991
[7]   On the learnability and design of output codes for multiclass problems [J].
Crammer, K ;
Singer, Y .
MACHINE LEARNING, 2002, 47 (2-3) :201-233
[8]  
Crammer K., 2009, NIPS 2009
[9]  
Crammer K., 2009, EMNLP 2009
[10]  
Dani V., 2008, COLT 2008