A comparison of nonlinear optimization methods for supervised learning in multilayer feedforward neural networks

Cited by: 14
Authors
Denton, JW [1]
Hung, MS [1]
Affiliation
[1] Kent State University, College of Business Administration, Kent, OH 44242
Keywords
neural network training; nonlinear programming
DOI
10.1016/0377-2217(96)00035-5
CLC Classification Number
C93 [Management Science]
Discipline Classification Codes
12; 1201; 1202; 120202
Abstract
One impediment to the use of neural networks in pattern classification problems is the excessive time required for supervised learning in larger multilayer feedforward networks. The use of nonlinear optimization techniques to perform neural network training offers a means of reducing that computing time. Two key issues in the implementation of nonlinear programming are the choice of a method for computing the search direction and the degree of accuracy required of the subsequent line search. This paper examines these issues through a designed experiment using six different pattern classification tasks, four search direction methods (conjugate gradient, quasi-Newton, and two levels of limited-memory quasi-Newton), and three levels of line search accuracy. It was found that for the simplest pattern classification problems, the conjugate gradient method performed well. For more complicated pattern classification problems, the limited-memory BFGS or the full BFGS method should be preferred. For very large problems, the best choice appears to be the limited-memory BFGS. It was also determined that, for the line search methods used in this study, increasing line search accuracy did not improve efficiency.
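The search-direction families compared in the abstract can be illustrated with off-the-shelf optimizers. The sketch below is not the authors' code or experimental design: it trains a tiny one-hidden-layer feedforward network on the XOR problem by minimizing a sum-of-squared-errors objective with SciPy's conjugate gradient ("CG"), quasi-Newton ("BFGS"), and limited-memory BFGS ("L-BFGS-B") routines. The network size, toy data, and optimizer settings are illustrative assumptions only.

```python
# Illustrative sketch (not the paper's setup): compare conjugate gradient,
# quasi-Newton (BFGS), and limited-memory BFGS for training a small
# feedforward network, as in the search-direction methods discussed above.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy two-class task: XOR. The paper uses six larger classification tasks.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

n_in, n_hid, n_out = 2, 4, 1
n_w = (n_in + 1) * n_hid + (n_hid + 1) * n_out  # weights including biases


def unpack(w):
    """Split the flat weight vector into layer matrices (bias row included)."""
    k = (n_in + 1) * n_hid
    return w[:k].reshape(n_in + 1, n_hid), w[k:].reshape(n_hid + 1, n_out)


def sse(w):
    """Sum-of-squared-errors objective over the training set."""
    W1, W2 = unpack(w)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append bias input
    H = 1.0 / (1.0 + np.exp(-Xb @ W1))               # hidden-layer sigmoids
    Hb = np.hstack([H, np.ones((H.shape[0], 1))])
    Y = 1.0 / (1.0 + np.exp(-Hb @ W2))               # output sigmoids
    return float(np.sum((Y - T) ** 2))


w0 = rng.normal(scale=0.5, size=n_w)  # small random starting weights

# SciPy's L-BFGS-B is the bound-constrained limited-memory BFGS; it is used
# here without bounds as a stand-in for limited-memory BFGS. Gradients are
# estimated by finite differences since no analytic gradient is supplied.
for method in ("CG", "BFGS", "L-BFGS-B"):
    res = minimize(sse, w0, method=method, options={"maxiter": 500})
    print(f"{method:8s}  final SSE = {res.fun:.4f}  function evals = {res.nfev}")
```

On a toy task of this size all three methods converge quickly; the abstract's finding concerns larger networks, where the limited-memory BFGS appears to scale best.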
Pages: 358-368
Page count: 11