General convergence results for linear discriminant updates

Cited by: 59
Authors
Grove, AJ [1 ]
Littlestone, N [1 ]
Schuurmans, D [1 ]
Affiliations
[1] NEC Res Inst, Princeton, NJ 08540 USA
Keywords
Winnow; Perceptron; on-line learning; mistake bounds; Bregman divergence
DOI
10.1023/A:1010844028087
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In this paper we define the general class of "quasi-additive" algorithms, which includes Perceptron and Winnow as special cases. We give a single proof of convergence that covers a broad subset of algorithms in this class, including not only Perceptron and Winnow but also many new algorithms. Our proof hinges on analyzing a generic measure-of-progress construction that gives insight into when and how such algorithms converge. Our measure-of-progress construction also permits us to obtain good mistake bounds for individual algorithms. We apply our unified analysis to new algorithms as well as existing algorithms. When applied to known algorithms, our method "automatically" produces close variants of existing proofs (recovering similar bounds), thus showing that, in a certain sense, these seemingly diverse results are fundamentally isomorphic. However, we also demonstrate that the unifying principles are more broadly applicable, and analyze a new class of algorithms that smoothly interpolate between the additive-update behavior of Perceptron and the multiplicative-update behavior of Winnow.
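The quasi-additive idea in the abstract can be pictured as follows: the algorithm accumulates an additive, mistake-driven correction in a parameter vector z and predicts with weights w = f(z), where the link function f determines the family member. The sketch below (the function names, toy data, and the simplified unnormalized exponential link are illustrative assumptions, not the paper's exact formulation) shows that an identity link gives Perceptron's additive weight updates, while an exponential link gives Winnow-style multiplicative updates.

```python
import numpy as np

def quasi_additive(X, y, link, eta=1.0, epochs=100):
    """Mistake-driven quasi-additive update (illustrative sketch):
    keep an additive parameter z, predict with weights w = link(z)."""
    z = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            w = link(z)
            if y_i * (w @ x_i) <= 0:     # update only on a mistake
                z += eta * y_i * x_i     # the additive core shared by the family
                mistakes += 1
        if mistakes == 0:                # converged on this sample
            break
    return link(z)

# Identity link: w = z, so updates to w are additive (Perceptron).
perceptron_link = lambda z: z
# Exponential link: w = exp(z), so each additive step in z multiplies w
# (a Winnow-style multiplicative update; classic Winnow also normalizes
# and thresholds, which this toy version omits).
winnow_link = lambda z: np.exp(z)

# Hypothetical linearly separable toy data: label = sign of first coordinate.
X = np.array([[2.0, 1.0], [1.5, -1.0], [-2.0, 0.5], [-1.0, -0.5]])
y = np.array([1, 1, -1, -1])

w_p = quasi_additive(X, y, perceptron_link)
w_w = quasi_additive(X, y, winnow_link, eta=0.5)
```

The paper's unified convergence analysis covers members of this family via a generic measure of progress; intermediate choices of link interpolate between the two update behaviors.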
Pages: 173-210 (38 pages)