Dynamics of batch training in a perceptron

Cited by: 7
Authors
Bos, S [1]
Opper, M [2]
Affiliations
[1] RIKEN, Lab Informat Synth, Brain Sci Inst, Wako, Saitama 35101, Japan
[2] Univ Wurzburg, D-97074 Wurzburg, Germany
Source
JOURNAL OF PHYSICS A-MATHEMATICAL AND GENERAL | 1998 / Vol. 31 / Issue 21
DOI
10.1088/0305-4470/31/21/004
Chinese Library Classification
O4 [Physics];
Discipline classification code
0702;
Abstract
Early stopping and weight decay are studied in a linear perceptron using a new simplified approach to the dynamics in the thermodynamical limit. The approach is directly deduced from the gradient descent weight update. It allows an exact description of the dynamics of the batch training process. The results are compared with a recent study of early stopping and weight decay based on the equilibrium statistical mechanics approach. It is shown that the equilibrium results are good approximations for early stopping and are exact for weight decay. Furthermore, in the dynamical approach it is possible to determine the number of training steps necessary to fulfil certain termination conditions. It can be shown that asymptotically, i.e. if the number of examples is large, only two batch steps are required to reach the optimal convergence, if the learning rate is optimally chosen.
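The setting described in the abstract can be illustrated with a minimal sketch of batch gradient descent in a linear perceptron. This is not the authors' analytical method; it is only a small numerical toy under assumed conventions: a noisy linear teacher, unit-variance Gaussian inputs, a quadratic training error with an optional weight decay term, and early stopping taken as the step with the lowest generalization error. All names and parameter values (`train`, `batch_step`, `lr`, `decay`, the sizes and noise level) are hypothetical choices made for illustration.

```python
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def training_error(w, xs, ys):
    # Mean squared training error, E_t = 1/(2P) * sum_mu (w.x_mu - y_mu)^2
    return sum((dot(w, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))


def batch_step(w, xs, ys, lr, decay):
    """One full-batch gradient step on E_t(w) + decay/2 * |w|^2."""
    grad = [decay * wi for wi in w]
    for x, y in zip(xs, ys):
        err = dot(w, x) - y
        for i, xi in enumerate(x):
            grad[i] += err * xi / len(xs)
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def train(n_inputs=10, n_examples=50, noise=0.5, lr=0.3, decay=0.0,
          steps=30, seed=0):
    rng = random.Random(seed)
    # Assumed toy setup: noisy linear teacher, Gaussian inputs.
    teacher = [rng.gauss(0, 1) for _ in range(n_inputs)]
    xs = [[rng.gauss(0, 1) for _ in range(n_inputs)]
          for _ in range(n_examples)]
    ys = [dot(teacher, x) + noise * rng.gauss(0, 1) for x in xs]

    w = [0.0] * n_inputs
    train_errs, gen_errs = [], []
    for _ in range(steps + 1):
        train_errs.append(training_error(w, xs, ys))
        # For unit-variance Gaussian inputs the excess generalization
        # error is |w - teacher|^2 / 2 (the noise floor is omitted).
        gen_errs.append(0.5 * sum((wi - ti) ** 2
                                  for wi, ti in zip(w, teacher)))
        w = batch_step(w, xs, ys, lr, decay)

    # Early stopping: the step at which the generalization error is lowest.
    best_step = min(range(len(gen_errs)), key=gen_errs.__getitem__)
    return train_errs, gen_errs, best_step
```

Calling `train()` and then `train(decay=0.1)` gives the two error curves for the unregularized and the weight-decay case, so the early-stopping step and the weight-decay result can be compared on the same synthetic data.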
Pages: 4835-4850
Number of pages: 16