INITIALIZING WEIGHTS OF A MULTILAYER PERCEPTRON NETWORK BY USING THE ORTHOGONAL LEAST-SQUARES ALGORITHM

Cited by: 27
Authors
LEHTOKANGAS, M [1]
SAARINEN, J [1]
KASKI, K [1]
HUUHTANEN, P [1]
Affiliations
[1] UNIV TAMPERE,DEPT MATH SCI,SF-33101 TAMPERE,FINLAND
Keywords
DOI
10.1162/neco.1995.7.5.982
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Usually the training of a multilayer perceptron network starts by initializing the network weights with small random values, and then the weight adjustment is carried out by an iterative gradient descent-based optimization routine called backpropagation training. If the random initial weights happen to be far from a good solution, or near a poor local optimum, training takes a long time because many iteration steps are required. Furthermore, it is quite possible that the network will not converge to an adequate solution at all. On the other hand, if the initial weights are close to a good solution, training is much faster and the chance of adequate convergence increases. In this paper a new method for initializing the weights is presented. The method is based on the orthogonal least squares algorithm. The simulation results obtained with the proposed initialization method show a considerable improvement in training compared to randomly initialized networks. In practical experiments, the proposed method has proven to be fast and useful for initializing the network weights.
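The abstract names the orthogonal least squares (OLS) algorithm but does not spell out the procedure, so the following is only a rough sketch of how an OLS-based initialization could look, not the authors' implementation. It assumes a one-hidden-layer network with tanh units, a pool of randomly drawn candidate hidden units, and greedy forward selection by error reduction ratio in the style of OLS regressor selection. The function names `ols_select` and `init_mlp_ols`, the candidate pool size, and the uniform sampling range are all hypothetical.

```python
import numpy as np


def ols_select(P, d, n_select):
    """Greedy OLS forward selection (illustrative sketch).

    P: (N, M) matrix whose columns are candidate regressors.
    d: (N,) target vector.
    Returns the chosen column indices and their error reduction ratios.
    """
    selected, ratios, basis = [], [], []
    dd = d @ d
    for _ in range(n_select):
        best_j, best_ratio, best_w = -1, -1.0, None
        for j in range(P.shape[1]):
            if j in selected:
                continue
            # Orthogonalize candidate j against the already chosen basis.
            w = P[:, j].astype(float).copy()
            for q in basis:
                w -= (q @ P[:, j]) / (q @ q) * q
            ww = w @ w
            if ww < 1e-12:  # numerically dependent on chosen columns
                continue
            g = (w @ d) / ww
            ratio = g * g * ww / dd  # share of target energy explained
            if ratio > best_ratio:
                best_j, best_ratio, best_w = j, ratio, w
        if best_j < 0:
            break
        selected.append(best_j)
        ratios.append(best_ratio)
        basis.append(best_w)
    return selected, ratios


def init_mlp_ols(X, y, n_hidden, n_candidates=50, seed=0):
    """Pick hidden units from a random pool, then fit output weights.

    The pool size and the sampling range are assumptions for this sketch.
    """
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(n_candidates, X.shape[1]))
    b = rng.uniform(-1.0, 1.0, size=n_candidates)
    H = np.tanh(X @ W.T + b)  # candidate hidden-unit outputs, (N, M)
    idx, ratios = ols_select(H, y, n_hidden)
    # Output-layer weights and bias by ordinary linear least squares.
    H_sel = np.column_stack([H[:, idx], np.ones(len(X))])
    v, *_ = np.linalg.lstsq(H_sel, y, rcond=None)
    return W[idx], b[idx], v[:-1], v[-1], ratios


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(-1.0, 1.0, size=(200, 2))
    y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1]
    W_h, b_h, v, v0, ratios = init_mlp_ols(X, y, n_hidden=5)
    pred = np.tanh(X @ W_h.T + b_h) @ v + v0
    print("initial MSE:", np.mean((pred - y) ** 2))
```

The weights returned here would serve only as a starting point; backpropagation training would then continue from them instead of from purely random values.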
Pages: 982-999
Page count: 18