Neural Network Training With Levenberg-Marquardt and Adaptable Weight Compression

Cited by: 69
Authors
Smith, James S. [1]
Wu, Bo [1, 2]
Wilamowski, Bogdan M. [1, 3]
Affiliations
[1] Auburn Univ, Dept Elect & Comp Engn, Auburn, AL 36849 USA
[2] Jinan Univ, Big Data Decis Inst, Guangzhou 510632, Guangdong, Peoples R China
[3] Univ IT & Management, PL-35225 Rzeszow, Poland
Keywords
Diminishing gradient; Levenberg-Marquardt (LM) algorithm; neural network training; weight compression
DOI
10.1109/TNNLS.2018.2846775
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Difficult experiments in training neural networks often fail to converge due to what is known as the flat-spot problem, where the gradient of hidden neurons in the network diminishes in value, rendering the weight update process ineffective. Whereas a first-order algorithm can address this issue by learning parameters to normalize neuron activations, second-order algorithms cannot afford additional parameters, given that they include a large Jacobian matrix calculation. This paper proposes Levenberg-Marquardt with weight compression (LM-WC), which combats the flat-spot problem by compressing neuron weights to push neuron activation out of the saturated region and close to the linear region. The presented algorithm requires no additional learned parameters and contains an adaptable compression parameter, which is adjusted to avoid training failure and increase the probability of neural network convergence. Several experiments are presented and discussed to demonstrate the success of LM-WC against standard LM and LM with random restarts on benchmark data sets for varying network architectures. Our results suggest that the LM-WC algorithm can improve training success by 10 times or more compared with other methods.
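The flat-spot effect and the general idea of compressing weights can be illustrated with a minimal NumPy sketch. The abstract does not state the paper's compression rule or how its adaptable parameter is updated, so `compress_weights` and its `target` argument are hypothetical stand-ins, not the LM-WC method itself. Only `lm_step` follows a standard formula, the usual LM update that solves (J^T J + mu I) dw = -J^T e, assuming errors are defined as output minus target.

```python
import numpy as np


def tanh_slope(net):
    """Derivative of tanh at the given net input (near 0 when saturated)."""
    return 1.0 - np.tanh(net) ** 2


def compress_weights(w, x, target=1.0):
    """Hypothetical compression rule (not taken from the paper):
    rescale w so the largest |net| over the data is at most `target`."""
    peak = np.max(np.abs(x @ w))
    if peak <= target:
        return w
    return w * (target / peak)


def lm_step(jacobian, errors, mu):
    """Standard LM update: solve (J^T J + mu*I) dw = -J^T e."""
    n_weights = jacobian.shape[1]
    approx_hessian = jacobian.T @ jacobian + mu * np.eye(n_weights)
    return -np.linalg.solve(approx_hessian, jacobian.T @ errors)


# One tanh neuron with two inputs and a bias column (illustrative data).
x = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [-1.0, -1.0, 1.0],
              [0.5, -0.5, 1.0]])
t = np.array([0.5, -0.5, 0.2, 0.4])          # targets
w_saturated = np.array([40.0, -35.0, 25.0])  # large weights saturate tanh

slope_before = tanh_slope(x @ w_saturated)
w_compressed = compress_weights(w_saturated, x, target=1.0)
slope_after = tanh_slope(x @ w_compressed)

# Jacobian of e = tanh(x @ w) - t with respect to w, one row per pattern.
errors = np.tanh(x @ w_compressed) - t
jacobian = slope_after[:, None] * x
dw = lm_step(jacobian, errors, mu=0.01)

print("smallest slope before compression:", slope_before.min())
print("smallest slope after compression: ", slope_after.min())
print("LM weight update after compression:", dw)
```

With saturated weights, the Jacobian rows scale with slopes close to zero, so the LM update carries almost no information. After rescaling, the slopes move back toward the linear region and the same update formula yields a usable step.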
Pages: 580-587
Page count: 8