Neural Network Training With Levenberg-Marquardt and Adaptable Weight Compression

Cited by: 69

Authors:

Smith, James S. [1]

Wu, Bo [1,2]

Wilamowski, Bogdan M. [1,3]

Affiliations:

[1] Auburn Univ, Dept Elect & Comp Engn, Auburn, AL 36849 USA

[2] Jinan Univ, Big Data Decis Inst, Guangzhou 510632, Guangdong, Peoples R China

[3] Univ IT & Management, PL-35225 Rzeszow, Poland

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2019, Vol. 30, No. 02

Keywords:

Diminishing gradient; Levenberg-Marquardt (LM) algorithm; neural network training; weight compression;

DOI:

10.1109/TNNLS.2018.2846775

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Difficult experiments in training neural networks often fail to converge due to what is known as the flat-spot problem, where the gradient of hidden neurons in the network diminishes in value, rendering the weight update process ineffective. Whereas a first-order algorithm can address this issue by learning parameters to normalize neuron activations, second-order algorithms cannot afford additional parameters given that they include a large Jacobian matrix calculation. This paper proposes Levenberg-Marquardt with weight compression (LM-WC), which combats the flat-spot problem by compressing neuron weights to push neuron activation out of the saturated region and close to the linear region. The presented algorithm requires no additional learned parameters and contains an adaptable compression parameter, which is adjusted to avoid training failure and increase the probability of neural network convergence. Several experiments are presented and discussed to demonstrate the success of LM-WC against standard LM and LM with random restarts on benchmark data sets for varying network architectures. Our results suggest that the LM-WC algorithm can improve training success by 10 times or more compared with other methods.
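
The flat-spot idea in the abstract can be illustrated with a small sketch. The rescaling rule below is an assumption made for illustration only, not the compression formula from the paper: the names `compress_neuron` and `max_net` are hypothetical, and the tanh activation is assumed. It shows how shrinking a saturated neuron's weights moves its net input back toward the linear region, where the activation slope (and hence the gradient) is no longer near zero.

```python
import math

def tanh_slope(net):
    """Derivative of tanh at a given net input: 1 - tanh(net)^2."""
    return 1.0 - math.tanh(net) ** 2

def neuron_nets(weights, bias, patterns):
    """Net input of one neuron for every training pattern."""
    return [sum(w * x for w, x in zip(weights, p)) + bias for p in patterns]

def compress_neuron(weights, bias, patterns, max_net=2.0):
    """Hypothetical compression rule (an assumption, not the paper's formula):
    rescale weights and bias by one factor so that the largest |net| over
    the training patterns does not exceed max_net."""
    peak = max(abs(n) for n in neuron_nets(weights, bias, patterns))
    if peak <= max_net:
        return list(weights), bias  # neuron is not saturated; leave unchanged
    scale = max_net / peak
    return [w * scale for w in weights], bias * scale

# Toy data: large weights drive every pattern deep into saturation.
patterns = [[1.0, 0.5], [-1.0, 2.0], [0.5, -1.5]]
weights, bias = [8.0, -6.0], 1.0

before = [tanh_slope(n) for n in neuron_nets(weights, bias, patterns)]
new_w, new_b = compress_neuron(weights, bias, patterns)
after = [tanh_slope(n) for n in neuron_nets(new_w, new_b, patterns)]

print("slopes before compression:", before)
print("slopes after compression: ", after)
```

In this toy case every activation slope after compression is larger than any slope before it, which is the effect the abstract attributes to weight compression. How the compression parameter is adapted during Levenberg-Marquardt training is described in the paper itself and is not reproduced here.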


Pages: 580-587

Page count: 8

50 items in total

[21] Performance of Levenberg-Marquardt Neural Network Algorithm in Air Quality Forecasting
Mun, Cho Kar
Abd Rahman, Nur Haizum
Ilias, Iszuanie Syafidza Che
SAINS MALAYSIANA, 2022, 51 (08): 2645-2654
[22] Research and application of RBF neural network based on modified Levenberg-Marquardt
Yang, Yanxia
Wang, Pu
Gao, Xuejin
Gao, Huihui
Qi, Zeyang
JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2022, 22 (05): 1597-1619
[23] Accelerated Levenberg-Marquardt Algorithm for Radial Basis Function Neural Network
Ma Miaoli
Wu Xiaolong
Han Honggui
2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020: 6804-6809
[24] GPU Implementation of the Feedforward Neural Network with Modified Levenberg-Marquardt Algorithm
Tomislav, Bacek
Majetic, Dubravko
Brezak, Danko
PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014: 785-791
[25] Performance of the Levenberg-Marquardt neural network approach in nuclear mass prediction
Zhang, Hai Fei
Wang, Li Hao
Yin, Jing Peng
Chen, Peng Hui
Zhang, Hong Fei
JOURNAL OF PHYSICS G-NUCLEAR AND PARTICLE PHYSICS, 2017, 44 (04)
[26] An artificial neural network and Levenberg-Marquardt training algorithm-based mathematical model for performance prediction
Bano, Farheen
Serbaya, Suhail H.
Rizwan, Ali
Shabaz, Mohammad
Hasan, Faraz
Khalifa, Hany S.
APPLIED MATHEMATICS IN SCIENCE AND ENGINEERING, 2024
[27] The assessment of Levenberg-Marquardt and Bayesian Framework training algorithm for prediction of concrete shrinkage by the artificial neural network
Garoosiha, Hosein
Ahmadi, Jamal
Bayat, Hossein
COGENT ENGINEERING, 2019, 6 (01)
[28] Modified Levenberg-Marquardt Algorithm for Backpropagation Neural Network Training in Dynamic Model Identification of Mechanical Systems
Li, Ming
Wu, Huapeng
Wang, Yongbo
Handroos, Heikki
Carbone, Giuseppe
JOURNAL OF DYNAMIC SYSTEMS MEASUREMENT AND CONTROL-TRANSACTIONS OF THE ASME, 2017, 139 (03)
[29] Fast Computational Approach to the Levenberg-Marquardt Algorithm for Training Feedforward Neural Networks
Bilski, Jaroslaw
Smolag, Jacek
Kowalczyk, Bartosz
Grzanek, Konrad
Izonin, Ivan
JOURNAL OF ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING RESEARCH, 2023, 13 (02): 45-61
[30] Nonmonotone Levenberg-Marquardt training of recurrent neural architectures for processing symbolic sequences
Peng, Chun-Cheng
Magoulas, George D.
NEURAL COMPUTING & APPLICATIONS, 2011, 20 (06): 897-908
