Partial Differential Equations for Training Deep Neural Networks

Cited by: 0
Authors
Chaudhari, Pratik [1]
Oberman, Adam [2]
Osher, Stanley [3]
Soatto, Stefano [1]
Carlier, Guillaume [4]
Affiliations
[1] Univ Calif Los Angeles, Comp Sci, Los Angeles, CA 90095 USA
[2] McGill Univ, Dept Math & Stat, Montreal, PQ, Canada
[3] Univ Calif Los Angeles, Dept Math, Los Angeles, CA 90024 USA
[4] Univ Paris IX Dauphine, CEREMADE, Paris, France
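Source
2017 FIFTY-FIRST ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS | 2017
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract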
This paper establishes a connection between non-convex optimization and nonlinear partial differential equations (PDEs). We interpret empirically successful relaxation techniques, motivated by statistical physics, for training deep neural networks as solutions of a viscous Hamilton-Jacobi (HJ) PDE. The underlying stochastic control interpretation allows us to prove that these techniques perform better than stochastic gradient descent. Our analysis provides insight into the geometry of the energy landscape and suggests new algorithms based on the non-viscous Hamilton-Jacobi PDE that can effectively tackle the high dimensionality of modern neural networks.
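As a rough illustration of the non-viscous case mentioned in the abstract, the sketch below evaluates the classical Hopf-Lax formula, u(x, t) = min_y [ f(y) + (x - y)^2 / (2t) ], which solves u_t + (1/2)|u_x|^2 = 0 with u(x, 0) = f(x). The abstract does not state this formula; it is a standard textbook result, and the one-dimensional toy loss `toy_loss`, the grid, and the value of `t` are assumptions chosen only for illustration. This is not the authors' algorithm.

```python
import math


def hopf_lax(f, xs, t):
    """Evaluate u(x, t) = min_y [ f(y) + (x - y)^2 / (2 t) ] for x, y on the grid xs."""
    return [min(f(y) + (x - y) ** 2 / (2.0 * t) for y in xs) for x in xs]


def toy_loss(x):
    # Hypothetical non-convex 1-D loss: a quadratic bowl with oscillations.
    return 0.5 * x * x + math.sin(5.0 * x)


# Assumed grid on [-3, 3] with spacing 0.01 and an assumed smoothing time t.
xs = [-3.0 + 0.01 * i for i in range(601)]
f_vals = [toy_loss(x) for x in xs]
u_vals = hopf_lax(toy_loss, xs, t=0.5)

# The relaxed function lies on or below the original loss everywhere,
# and on this grid its minimum equals the minimum of the original loss.
lies_below = all(u <= fv for u, fv in zip(u_vals, f_vals))
same_minimum = min(u_vals) == min(f_vals)
```

The brute-force minimum over the grid costs O(n^2) evaluations, which is workable only in this low-dimensional toy setting; it is meant to show how the relaxation fills in narrow valleys while leaving the global minimum value unchanged.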
Pages: 1627-1631
Page count: 5