Neural Network Training With Levenberg-Marquardt and Adaptable Weight Compression

Cited by: 69
Authors
Smith, James S. [1]
Wu, Bo [1, 2]
Wilamowski, Bogdan M. [1, 3]
Affiliations
[1] Auburn Univ, Dept Elect & Comp Engn, Auburn, AL 36849 USA
[2] Jinan Univ, Big Data Decis Inst, Guangzhou 510632, Guangdong, Peoples R China
[3] Univ IT & Management, PL-35225 Rzeszow, Poland
Keywords
Diminishing gradient; Levenberg-Marquardt (LM) algorithm; neural network training; weight compression
DOI
10.1109/TNNLS.2018.2846775
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Difficult experiments in training neural networks often fail to converge due to what is known as the flat-spot problem, where the gradient of hidden neurons in the network diminishes in value, rendering the weight update process ineffective. Whereas a first-order algorithm can address this issue by learning parameters to normalize neuron activations, second-order algorithms cannot afford additional parameters given that they include a large Jacobian matrix calculation. This paper proposes Levenberg-Marquardt with weight compression (LM-WC), which combats the flat-spot problem by compressing neuron weights to push neuron activation out of the saturated region and close to the linear region. The presented algorithm requires no additional learned parameters and contains an adaptable compression parameter, which is adjusted to avoid training failure and increase the probability of neural network convergence. Several experiments are presented and discussed to demonstrate the success of LM-WC against standard LM and LM with random restarts on benchmark data sets for varying network architectures. Our results suggest that the LM-WC algorithm can improve training success by 10 times or more compared with other methods.
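The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: `lm_step` is the textbook Levenberg-Marquardt update, while `compress_weights`, its `limit` parameter, and the rescaling rule are hypothetical stand-ins chosen only to show how shrinking a neuron's weights moves a tanh activation out of saturation. The record does not give the paper's actual compression formula or its rule for adapting the compression parameter.

```python
import numpy as np


def lm_step(jacobian, errors, mu):
    """Textbook Levenberg-Marquardt update: dw = -(J^T J + mu*I)^-1 J^T e."""
    jtj = jacobian.T @ jacobian
    grad = jacobian.T @ errors
    return -np.linalg.solve(jtj + mu * np.eye(jtj.shape[0]), grad)


def compress_weights(weights, inputs, limit=2.0):
    """Hypothetical compression rule (an assumption, not the paper's formula).

    Rescales each neuron's incoming weights so that its largest absolute
    net input over the data does not exceed `limit`. Neurons that are
    already inside the limit are left unchanged.
    """
    net = inputs @ weights                      # shape: (samples, neurons)
    peak = np.max(np.abs(net), axis=0)          # worst-case net input per neuron
    scale = np.minimum(1.0, limit / np.maximum(peak, 1e-12))
    return weights * scale


# Illustration: large weights saturate tanh, so its derivative is close to zero.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
w_saturated = rng.normal(size=(3, 4)) * 20.0
w_compressed = compress_weights(w_saturated, x, limit=2.0)

slope_before = np.mean(1.0 - np.tanh(x @ w_saturated) ** 2)
slope_after = np.mean(1.0 - np.tanh(x @ w_compressed) ** 2)
print(slope_before, slope_after)
```

Because the derivative of tanh decreases as the absolute net input grows, shrinking the weights can only raise the mean slope printed above, which is the effect the abstract attributes to compression. A fixed `limit` is used here for simplicity; the abstract describes the compression parameter as adaptable.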
Pages: 580-587
Number of pages: 8
Related papers
50 in total
  • [21] Performance of Levenberg-Marquardt Neural Network Algorithm in Air Quality Forecasting
    Mun, Cho Kar
    Abd Rahman, Nur Haizum
    Ilias, Iszuanie Syafidza Che
    SAINS MALAYSIANA, 2022, 51 (08): 2645-2654
  • [22] Research and application of RBF neural network based on modified Levenberg-Marquardt
    Yang, Yanxia
    Wang, Pu
    Gao, Xuejin
    Gao, Huihui
    Qi, Zeyang
    JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2022, 22 (05): 1597-1619
  • [23] Accelerated Levenberg-Marquardt Algorithm for Radial Basis Function Neural Network
    Ma Miaoli
    Wu Xiaolong
    Han Honggui
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020: 6804-6809
  • [24] GPU Implementation of the Feedforward Neural Network with Modified Levenberg-Marquardt Algorithm
    Tomislav, Bacek
    Majetic, Dubravko
    Brezak, Danko
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014: 785-791
  • [25] Performance of the Levenberg-Marquardt neural network approach in nuclear mass prediction
    Zhang, Hai Fei
    Wang, Li Hao
    Yin, Jing Peng
    Chen, Peng Hui
    Zhang, Hong Fei
    JOURNAL OF PHYSICS G-NUCLEAR AND PARTICLE PHYSICS, 2017, 44 (04)
  • [26] An artificial neural network and Levenberg-Marquardt training algorithm-based mathematical model for performance prediction
    Bano, Farheen
    Serbaya, Suhail H.
    Rizwan, Ali
    Shabaz, Mohammad
    Hasan, Faraz
    Khalifa, Hany S.
    APPLIED MATHEMATICS IN SCIENCE AND ENGINEERING, 2024
  • [27] The assessment of Levenberg-Marquardt and Bayesian Framework training algorithm for prediction of concrete shrinkage by the artificial neural network
    Garoosiha, Hosein
    Ahmadi, Jamal
    Bayat, Hossein
    COGENT ENGINEERING, 2019, 6 (01)
  • [28] Modified Levenberg-Marquardt Algorithm for Backpropagation Neural Network Training in Dynamic Model Identification of Mechanical Systems
    Li, Ming
    Wu, Huapeng
    Wang, Yongbo
    Handroos, Heikki
    Carbone, Giuseppe
    JOURNAL OF DYNAMIC SYSTEMS MEASUREMENT AND CONTROL-TRANSACTIONS OF THE ASME, 2017, 139 (03)
  • [29] Fast Computational Approach to the Levenberg-Marquardt Algorithm for Training Feedforward Neural Networks
    Bilski, Jaroslaw
    Smolag, Jacek
    Kowalczyk, Bartosz
    Grzanek, Konrad
    Izonin, Ivan
    JOURNAL OF ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING RESEARCH, 2023, 13 (02): 45-61
  • [30] Nonmonotone Levenberg-Marquardt training of recurrent neural architectures for processing symbolic sequences
    Peng, Chun-Cheng
    Magoulas, George D.
    NEURAL COMPUTING & APPLICATIONS, 2011, 20 (06): 897-908