On the Local Hessian in Back-propagation

Authors
Zhang, Huishuai [1]
Chen, Wei [1]
Liu, Tie-Yan [1]
Affiliations
[1] Microsoft Res Asia, Beijing 100080, Peoples R China
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Vol. 31
Keywords
NEURAL-NETWORKS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Back-propagation (BP) is the foundation for successfully training deep neural networks. However, BP sometimes has difficulty propagating a learning signal effectively through many layers, as in the vanishing gradient phenomenon. Meanwhile, BP often works well when combined with "design tricks" such as orthogonal initialization, batch normalization, and skip connections. There is no clear understanding of what is essential to the efficiency of BP. In this paper, we take one step towards clarifying this problem. We view BP as a solution of back-matching propagation, which minimizes a sequence of back-matching losses, each corresponding to one block of the network. We study the Hessian of the local back-matching loss (the local Hessian) and connect it to the efficiency of BP. It turns out that those design tricks facilitate BP by improving the spectrum of the local Hessian. In addition, we can use the local Hessian to balance the training pace of each block and to design new training algorithms. Based on a scalar approximation of the local Hessian, we propose a scale-amended SGD algorithm. We apply it to train neural networks with batch normalization and achieve favorable results compared with vanilla SGD. This corroborates the importance of the local Hessian from another side.
Pages: 11