Curvature-corrected learning dynamics in deep neural networks

Cited by: 0
Authors
Huh, Dongsung [1]
Affiliation
[1] MIT-IBM Watson AI Lab, Cambridge, MA 02142 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119 | 2020, Vol. 119
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks exhibit complex learning dynamics due to their non-convex loss landscapes. Second-order optimization methods facilitate learning dynamics by compensating for ill-conditioned curvature. In this work, we investigate how curvature correction modifies the learning dynamics in deep linear neural networks and provide analytical solutions. We derive a generalized conservation law that preserves the path of parameter dynamics under curvature correction, which shows that curvature correction only modifies the temporal profiles of dynamics along the path. We show that while curvature correction accelerates the convergence dynamics of the input-output map, it can also negatively affect the generalization performance. Our analysis also reveals an undesirable effect of curvature correction that compromises the stability of parameter dynamics during learning, especially with the block-diagonal approximation of natural gradient descent. We introduce fractional curvature correction that resolves this problem while retaining most of the acceleration benefits of full curvature correction.
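The path-preservation claim in the abstract can be illustrated with a toy model. The sketch below is not the paper's derivation or code: it assumes a scalar two-layer linear network w = a*b fit to a target s with squared error, where plain gradient flow conserves a^2 - b^2. For this model a damped Gauss-Newton step, (J^T J + lam*I)^{-1} J^T e with J = [b, a], reduces to the gradient divided by (a^2 + b^2 + lam), so it rescales the step without changing its direction. The names `train`, `power`, and `damping` are hypothetical, and the `power` exponent is only an assumed stand-in for a fractional correction, interpolating between gradient descent (0) and the full correction (1).

```python
def train(a, b, s, lr, steps, power=0.0, damping=1e-3):
    """Fit the scalar two-layer linear map w = a*b to a target s.

    power=0.0 -> plain gradient descent on 0.5 * (a*b - s)**2.
    power=1.0 -> damped Gauss-Newton step, which for this toy model
                 equals the gradient divided by (a^2 + b^2 + damping).
    0 < power < 1 -> assumed fractional rescaling (illustrative only).
    Returns the per-step |error| and the per-step value of a^2 - b^2.
    """
    errors, invariants = [], []
    for _ in range(steps):
        e = a * b - s                          # output error
        errors.append(abs(e))
        invariants.append(a * a - b * b)       # conserved by gradient flow
        grad_a, grad_b = b * e, a * e          # gradient of the loss
        # Scalar rescaling: changes the speed, not the direction, of the step.
        scale = (a * a + b * b + damping) ** (-power)
        a, b = a - lr * scale * grad_a, b - lr * scale * grad_b
    return errors, invariants


if __name__ == "__main__":
    for name, p in [("gradient descent", 0.0),
                    ("fractional (assumed p=0.5)", 0.5),
                    ("full correction", 1.0)]:
        errs, inv = train(a=0.1, b=0.2, s=1.0, lr=1e-3, steps=20000, power=p)
        print(f"{name}: final |error| = {errs[-1]:.2e}, "
              f"a^2 - b^2 start = {inv[0]:.4f}, end = {inv[-1]:.4f}")
```

With a small step size, all three settings keep a^2 - b^2 close to its initial value, so they trace approximately the same hyperbola in the (a, b) plane and differ only in how quickly they move along it. The small residual drift comes from the discrete step, not from the rescaling.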
Pages: 9