Methods to speed up error back-propagation learning algorithm

Cited by: 64
Author
Sarkar, D
Affiliation
Keywords
adaptive learning rate; artificial neural networks (ANNs); conjugate gradient method; energy function; error back-propagation learning; feedforward networks; learning rate; momentum; oscillation of weights; training set size;
DOI
10.1145/234782.234785
CLC Number (Chinese Library Classification)
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
Error back-propagation (EBP) is now the most widely used training algorithm for feedforward artificial neural networks (FFANNs). However, it is generally believed to be very slow if it converges at all, especially if the network size is not too large compared to the problem at hand. The main problem with the EBP algorithm is that it uses a constant learning rate coefficient, while different regions of the error surface may have different characteristic gradients and may therefore require a dynamic change of the learning rate coefficient based on the nature of the surface. Moreover, the characteristics of the error surface may be unique in every dimension, which may call for one learning rate coefficient per weight. To overcome these problems, several modifications have been suggested, and this survey is an attempt to present them together and to compare them. The first modification was the momentum strategy, in which a fraction of the last weight correction is added to the currently suggested weight correction. It has both an accelerating and a decelerating effect where they are needed. However, this method provides only a relatively small dynamic range for the learning rate coefficient. To increase this dynamic range, methods such as the "bold driver" and SAB (self-adaptive back-propagation) were proposed. A modification to SAB that eliminates the need for the user to select a "good" learning rate coefficient gave SuperSAB. A slight modification to the momentum strategy produced a new method that controls the oscillation of weights to speed up learning. A modification to the EBP algorithm in which the gradients are rescaled at every layer also improved performance. Using the "expected output" of a neuron instead of its actual output for correcting weights improved the performance of the momentum strategy. The conjugate gradient method and "self-determination of adaptive learning rate" require no learning rate coefficient from the user. The use of energy functions other than the sum of squared errors has shown an improved convergence rate. Effective selection of the learning rate coefficient also needs to consider the size of the training set. All these methods to improve the performance of the EBP algorithm are presented here.
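As a rough illustration of the two ideas the abstract describes most concretely, the Python sketch below shows a momentum-style weight update (a fraction of the previous correction is added to the current one) and a "bold driver"-style learning-rate adaptation (grow the rate while the error falls, cut it after an increase). This is a minimal sketch under stated assumptions, not code from the surveyed paper; the function names, the toy error surface, and the growth/shrink factors are illustrative choices.

```python
# Illustrative sketch only: momentum update plus a "bold driver"-style
# adaptive learning rate, demonstrated on a toy quadratic error surface.
import numpy as np

def momentum_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One update: the new correction is the gradient-descent step plus a
    fraction (momentum) of the previous correction (velocity)."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

def bold_driver(lr, new_error, old_error, grow=1.1, shrink=0.5):
    """Increase the learning rate slightly while the error keeps falling;
    decrease it sharply after a step that raised the error.
    The factors 1.1 and 0.5 are illustrative assumptions."""
    return lr * grow if new_error < old_error else lr * shrink

# Toy demonstration: minimize E(w) = 0.5 * ||w||^2, whose gradient is w.
w = np.array([3.0, -2.0])
velocity = np.zeros_like(w)
lr = 0.1
old_error = 0.5 * np.dot(w, w)
for _ in range(50):
    grad = w                                  # dE/dw for the toy surface
    w, velocity = momentum_step(w, grad, velocity, lr)
    new_error = 0.5 * np.dot(w, w)
    lr = bold_driver(lr, new_error, old_error)
    old_error = new_error
print("final weights:", w, "final learning rate:", lr)
```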
Pages: 519-542
Number of pages: 24