Curvature-corrected learning dynamics in deep neural networks

Times cited: 0
Author
Huh, Dongsung [1]
Affiliation
[1] MIT IBM Watson AI Lab, Cambridge, MA 02142 USA
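Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119 | 2020 / Vol. 119
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract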
Deep neural networks exhibit complex learning dynamics due to their non-convex loss landscapes. Second-order optimization methods facilitate learning dynamics by compensating for ill-conditioned curvature. In this work, we investigate how curvature correction modifies the learning dynamics in deep linear neural networks and provide analytical solutions. We derive a generalized conservation law that preserves the path of parameter dynamics under curvature correction, which shows that curvature correction only modifies the temporal profiles of dynamics along the path. We show that while curvature correction accelerates the convergence dynamics of the input-output map, it can also negatively affect the generalization performance. Our analysis also reveals an undesirable effect of curvature correction that compromises the stability of parameter dynamics during learning, especially with the block-diagonal approximation of natural gradient descent. We introduce fractional curvature correction that resolves this problem while retaining most of the acceleration benefits of full curvature correction.
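The conservation-law claim in the abstract can be illustrated with a toy sketch. This is not the paper's derivation or code. It assumes a scalar two-layer linear network w = a*b trained toward a target s with loss 0.5*(a*b - s)^2, for which gradient flow is known to conserve a^2 - b^2. The function names, the exponent q, and all numeric values are hypothetical choices for this illustration. In this scalar case the gradient can be rescaled by (a^2 + b^2)^(-q), where q = 0 gives plain gradient descent, q = 1 corresponds to a pseudo-inverse Gauss-Newton step, and intermediate q stands in for a fractional correction. Because the rescaling is a positive scalar, it changes only the speed along the path, not the path itself.

```python
def simulate(q, a=0.1, b=0.2, s=1.0, eta=1e-3, steps=20000):
    """Train w = a*b toward target s with small discrete steps.

    q = 0 is plain gradient descent; q = 1 divides the step by a^2 + b^2
    (a toy stand-in for full curvature correction in this scalar case);
    0 < q < 1 is a hypothetical fractional variant.
    """
    path = [(a, b)]
    for _ in range(steps):
        err = a * b - s                    # residual of the input-output map
        scale = (a * a + b * b) ** (-q)    # positive scalar preconditioner (assumption)
        # Simultaneous update of both parameters along the (rescaled) gradient.
        a, b = a - eta * scale * err * b, b - eta * scale * err * a
        path.append((a, b))
    return path


def invariant(a, b):
    """Quantity conserved by gradient flow in this toy model."""
    return a * a - b * b


for q in (0.0, 0.5, 1.0):
    path = simulate(q)
    a_end, b_end = path[-1]
    print(q, a_end * b_end, invariant(*path[0]), invariant(a_end, b_end))
```

With a finite step size the invariant drifts slightly (the drift is second order in the step size), so the printed start and end values agree only approximately.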
Pages: 9
Related papers
50 in total
  • [21] Curvature-Corrected Surface Boundary Condition for the Euler Equations on Cartesian Grids
    Sang Weimin
    Xu Lu
    Lei Xiwei
    PROCEEDINGS OF 2010 ASIA-PACIFIC INTERNATIONAL SYMPOSIUM ON AEROSPACE TECHNOLOGY, VOL 1 AND 2, 2010, : 283 - +
  • [22] Learning Graph Dynamics using Deep Neural Networks
    Narayan, Apurva
    Roe, Peter H. O'N
    IFAC PAPERSONLINE, 2018, 51 (02): : 433 - 438
  • [23] Anomalous diffusion dynamics of learning in deep neural networks
    Chen, Guozhang
    Qu, Cheng Kevin
    Gong, Pulin
    NEURAL NETWORKS, 2022, 149 : 18 - 28
  • [24] A Multi-Piecewise Curvature-Corrected Technique for Bandgap Reference Circuits
    Huang, Yi
    Cheung, Chun
    Najafizadeh, Laleh
    2013 IEEE 56TH INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS), 2013, : 305 - 308
  • [25] A multiple transistor combination low-voltage curvature-corrected bandgap reference
    苏凯
    龚敏
    秦怀斌
    孙晨
    Journal of Semiconductors, 2013, 34 (06) : 152 - 156
  • [26] CMOS LOGARITHMIC CURVATURE-CORRECTED VOLTAGE REFERENCE BY USING A MULTIPLE DIFFERENTIAL STRUCTURE
    Popa, Cosmin
    REVUE ROUMAINE DES SCIENCES TECHNIQUES-SERIE ELECTROTECHNIQUE ET ENERGETIQUE, 2010, 55 (04): : 436 - 444
  • [27] CMOS logarithmic curvature-corrected voltage reference using a multiple differential structure
    Popa, C
    ISSCS 2005: International Symposium on Signals, Circuits and Systems, Vols 1 and 2, Proceedings, 2005, : 413 - 416
  • [28] Superior-Order Curvature-Corrected Voltage Reference Using a Current Generator
    Popa, Cosmin
    ARTIFICIAL NEURAL NETWORKS-ICANN 2010, PT I, 2010, 6352 : 12 - 21
  • [29] Characterizing Learning Dynamics of Deep Neural Networks via Complex Networks
    La Malfa, Emanuele
    La Malfa, Gabriele
    Nicosia, Giuseppe
    Latora, Vito
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 344 - 351
  • [30] A multiple transistor combination low-voltage curvature-corrected bandgap reference
    Su Kai
    Gong Min
    Qin Huaibin
    Sun Chen
    JOURNAL OF SEMICONDUCTORS, 2013, 34 (06)