DEEP LEARNING AS OPTIMAL CONTROL PROBLEMS: MODELS AND NUMERICAL METHODS

Cited by: 38
Authors
Benning, Martin [1]
Celledoni, Elena [2]
Ehrhardt, Matthias J. [3]
Owren, Brynjulf [2]
Schonlieb, Carola-Bibiane [4]
Affiliations
[1] Queen Mary Univ London, Sch Math Sci, London E1 4NS, England
[2] NTNU, Dept Math Sci, N-7491 Trondheim, Norway
[3] Univ Bath, Inst Math Innovat, Bath BA2 7JU, Avon, England
[4] Univ Cambridge, Dept Appl Math & Theoret Phys, Cambridge CB3 0WA, England
Source
JOURNAL OF COMPUTATIONAL DYNAMICS | 2019 / Vol. 6 / No. 02
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Deep learning; optimal control; Runge-Kutta methods; Hamiltonian boundary value problems; EQUATIONS; SCHEMES;
DOI
10.3934/jcd.2019009
Chinese Library Classification number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
We consider recent work of [18] and [9], where deep learning neural networks have been interpreted as discretisations of an optimal control problem subject to an ordinary differential equation constraint. We review the first order conditions for optimality, and the conditions ensuring optimality after discretisation. This leads to a class of algorithms for solving the discrete optimal control problem which guarantee that the corresponding discrete necessary conditions for optimality are fulfilled. The differential equation setting lends itself to learning additional parameters such as the time discretisation. We explore this extension alongside natural constraints (e.g. time steps lie in a simplex). We compare these deep learning algorithms numerically in terms of induced flow and generalisation ability.
Pages: 171-198
Page count: 28
Related papers
48 items in total
  • [1] Machine Learning: Deepest Learning as Statistical Data Assimilation Problems
    Abarbanel, Henry D. I.
    Rozdeba, Paul J.
    Shirman, Sasha
    [J]. NEURAL COMPUTATION, 2018, 30 (08) : 2025 - 2055
  • [2] Agrachev A. A., 2004, ENCY MATH SCI, V87
  • [3] Modern regularization methods for inverse problems
    Benning, Martin
    Burger, Martin
    [J]. ACTA NUMERICA, 2018, 27 : 1 - 111
  • [4] Bishop CM, 2006, Pattern recognition and machine learning, DOI 10.1007/978-0-387-45528-0
  • [5] On the necessity of negative coefficients for operator splitting schemes of order higher than two
    Blanes, S
    Casas, F
    [J]. APPLIED NUMERICAL MATHEMATICS, 2005, 54 (01) : 23 - 37
  • [7] Burger M, 2006, COMMUN MATH SCI, V4, P179
  • [8] Burger M, 2005, LECT NOTES COMPUT SC, V3752, P25
  • [9] Chang B, 2018, AAAI CONF ARTIF INTE, P2811
  • [10] Chen R. T., 2018, P ADV NEUR INF PROC, V31