Adaptive regularization of weight vectors

Cited by: 181
Authors
Crammer, Koby [1]
Kulesza, Alex [2]
Dredze, Mark [3]
Affiliations
[1] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
[2] Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
[3] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21211 USA
Keywords
Online learning; Supervised learning; Text classification; Adaptive regularization; Perceptron; Margin; Noise
DOI
10.1007/s10994-013-5327-x
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present AROW, an online learning algorithm for binary and multiclass problems that combines large margin training, confidence weighting, and the capacity to handle non-separable data. AROW performs adaptive regularization of the prediction function upon seeing each new instance, allowing it to perform especially well in the presence of label noise. We derive mistake bounds for the binary and multiclass settings that are similar in form to the second order perceptron bound. Our bounds do not assume separability. We also relate our algorithm to recent confidence-weighted online learning techniques. Empirical evaluations show that AROW achieves state-of-the-art performance on a wide range of binary and multiclass tasks, as well as robustness in the face of non-separable data.
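The per-instance adaptive regularization described in the abstract can be illustrated with a minimal sketch of an AROW-style binary update. This is not the authors' reference implementation. It assumes a diagonal approximation of the covariance (one variance per feature) to keep the code short, and the class name `DiagonalAROW`, the regularization parameter name `r`, and the initial values are choices made for this illustration only.

```python
class DiagonalAROW:
    """Sketch of an AROW-style binary learner with a diagonal covariance.

    mu holds the mean weight vector; sigma holds one variance per feature,
    which acts as a per-feature learning rate (an assumed simplification
    of the full covariance matrix).
    """

    def __init__(self, dim, r=1.0):
        self.mu = [0.0] * dim      # mean weights
        self.sigma = [1.0] * dim   # per-feature variance (confidence)
        self.r = r                 # regularization parameter, r > 0

    def margin(self, x):
        return sum(m * xi for m, xi in zip(self.mu, x))

    def predict(self, x):
        return 1 if self.margin(x) >= 0 else -1

    def update(self, x, y):
        """Process one instance x with label y in {-1, +1}.

        Returns True if the weights changed (hinge loss was positive).
        """
        m = self.margin(x)
        if y * m >= 1.0:
            return False  # margin already large enough, no update
        # Confidence: variance of the margin under the current distribution.
        v = sum(s * xi * xi for s, xi in zip(self.sigma, x))
        beta = 1.0 / (v + self.r)
        alpha = (1.0 - y * m) * beta
        for i, xi in enumerate(x):
            s = self.sigma[i]
            self.mu[i] += alpha * y * s * xi            # move mean toward label
            self.sigma[i] = s - beta * s * s * xi * xi  # shrink variance
        return True


model = DiagonalAROW(dim=2)
model.update([1.0, 0.0], 1)
print(model.mu, model.sigma)
```

Features that have been seen often end up with a smaller variance and therefore receive smaller updates, which is one way to read the robustness to label noise that the abstract claims.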
Source: Machine Learning, 2013, 91(2)
Pages: 155-187
Page count: 33