Online learning from finite training sets and robustness to input bias

Cited by: 9
Authors
Sollich, P
Barber, D
Affiliations
[1] Univ Edinburgh, Dept Phys, Edinburgh EH9 3JZ, Midlothian, Scotland
[2] Univ Nijmegen, SNN, Real World Comp Partnership Theoret Fdn, NL-6525 EZ Nijmegen, Netherlands
DOI
10.1162/089976698300017034
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We analyze online gradient descent learning from finite training sets at noninfinitesimal learning rates eta. Exact results are obtained for the time-dependent generalization error of a simple model system: a linear network with a large number of weights N, trained on p = alpha N examples. This allows us to study in detail the effects of finite training set size alpha on, for example, the optimal choice of learning rate eta. We also compare online and offline learning, for respective optimal settings of eta at given final learning time. Online learning turns out to be much more robust to input bias and actually outperforms offline learning when such bias is present; for unbiased inputs, online and offline learning perform almost equally well.
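The setting described in the abstract can be explored numerically with a small simulation. The following is a minimal sketch, not the authors' method or their exact analysis: it assumes Gaussian unbiased inputs, a noise-free linear teacher, outputs scaled by 1/sqrt(N), and an offline update that sums the gradient over the whole training set. All function names (`make_training_set`, `train_online`, `train_offline`, `generalization_error`) are hypothetical and chosen only for illustration.

```python
import random


def output(weights, x):
    """Linear network output, scaled by 1/sqrt(N) (an assumed convention)."""
    return sum(w * xi for w, xi in zip(weights, x)) / len(weights) ** 0.5


def make_training_set(n_weights, alpha, seed=0):
    """Draw a random linear teacher and p = alpha * N noise-free examples."""
    rng = random.Random(seed)
    teacher = [rng.gauss(0.0, 1.0) for _ in range(n_weights)]
    examples = []
    for _ in range(int(alpha * n_weights)):
        # Unbiased inputs: zero-mean, unit-variance Gaussian components.
        x = [rng.gauss(0.0, 1.0) for _ in range(n_weights)]
        examples.append((x, output(teacher, x)))
    return teacher, examples


def generalization_error(student, teacher):
    """Expected squared output error on a fresh unbiased input, times 1/2."""
    n = len(teacher)
    return 0.5 * sum((s - t) ** 2 for s, t in zip(student, teacher)) / n


def train_online(examples, n_weights, eta, n_steps, seed=1):
    """Online gradient descent: one randomly chosen example per update."""
    rng = random.Random(seed)
    scale = n_weights ** 0.5
    w = [0.0] * n_weights
    for _ in range(n_steps):
        x, y = rng.choice(examples)
        err = y - output(w, x)
        w = [wi + eta * err * xi / scale for wi, xi in zip(w, x)]
    return w


def train_offline(examples, n_weights, eta, n_steps):
    """Offline (batch) gradient descent: every example in each update."""
    scale = n_weights ** 0.5
    w = [0.0] * n_weights
    for _ in range(n_steps):
        step = [0.0] * n_weights  # negative gradient summed over the set
        for x, y in examples:
            err = y - output(w, x)
            for i, xi in enumerate(x):
                step[i] += err * xi / scale
        w = [wi + eta * si for wi, si in zip(w, step)]
    return w


if __name__ == "__main__":
    n, alpha = 20, 2.0
    teacher, examples = make_training_set(n, alpha)
    w_on = train_online(examples, n, eta=0.5, n_steps=5000)
    w_off = train_offline(examples, n, eta=0.1, n_steps=200)
    print("initial:", generalization_error([0.0] * n, teacher))
    print("online: ", generalization_error(w_on, teacher))
    print("offline:", generalization_error(w_off, teacher))
```

The learning rates and step counts in the demo are arbitrary illustrative values, not the optimal settings studied in the paper. Adding a constant offset to every input component would be one way to experiment with input bias, though the error formula above would then no longer hold as written.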
Pages: 2201-2217
Number of pages: 17