Conservative SPDEs as fluctuating mean field limits of stochastic gradient descent

Citations: 0
Authors
Gess, Benjamin [1 ,2 ]
Gvalani, Rishabh S. [3 ]
Konarovskyi, Vitalii [4 ,5 ]
Affiliations
[1] Tech Univ Berlin, Inst Math, Str. des 17. Juni 136, D-10623 Berlin, Germany
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Swiss Fed Inst Technol, D-MATH, CH-8092 Zurich, Switzerland
[4] Univ Hamburg, Fak Math Informat & Nat Wissensch, D-20146 Hamburg, Germany
[5] NAS Ukraine, Inst Math, UA-01024 Kyiv, Ukraine
Keywords
Stochastic gradient descent; Machine learning; Overparametrization; Dean-Kawasaki equation; SDE with interaction; Fluctuation mean field limit; Law of large numbers; Central limit theorem; PARTIAL-DIFFERENTIAL-EQUATIONS; NEURAL-NETWORKS; SYSTEM; DEVIATIONS; PARTICLES; MODEL;
DOI
10.1007/s00440-024-01353-6
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline codes
020208 ; 070103 ; 0714 ;
Abstract
The convergence of stochastic interacting particle systems in the mean-field limit to solutions of conservative stochastic partial differential equations (SPDEs) is established, with an optimal rate of convergence. As a second main result, a quantitative central limit theorem for such SPDEs is derived, again with an optimal rate of convergence. The results apply, in particular, to the convergence, in the mean-field scaling, of stochastic gradient descent dynamics in overparametrized shallow neural networks to solutions of SPDEs. It is shown that including fluctuations in the limiting SPDE improves the rate of convergence and retains information about the fluctuations of stochastic gradient descent in the continuum limit.
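The setting described in the abstract can be illustrated with a minimal numerical sketch (an assumption of this note, not the paper's construction): online SGD for an overparametrized shallow network f(x) = (1/N) Σᵢ aᵢ tanh(wᵢ x). Under the mean-field scaling, the N neuron parameters (aᵢ, wᵢ) behave like interacting particles, and it is their empirical measure whose large-N limit and fluctuations the paper's conservative SPDE captures. All names, the target function, and the parameter choices below are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: online SGD for a shallow mean-field network
#   f(x) = (1/N) * sum_i a_i * tanh(w_i * x),
# fitting a toy target. The particles (a_i, w_i) receive O(1) updates because
# the 1/N network normalization is absorbed by the mean-field learning rate
# eta * N; the empirical measure of the particles is the object whose
# mean-field limit the abstract describes.

rng = np.random.default_rng(0)

N, eta, steps = 200, 0.05, 1000

def target(x):
    return np.sin(x)  # toy regression target (assumption)

# particle initialization: i.i.d. samples from the initial measure
w = rng.normal(size=N)
a = rng.normal(size=N)

def predict(x, a, w):
    return np.mean(a * np.tanh(w * x))  # mean-field 1/N normalization

def mse(a, w):
    # test error on a fixed grid
    xs = np.linspace(-2.0, 2.0, 101)
    return float(np.mean([(predict(x, a, w) - target(x)) ** 2 for x in xs]))

mse_before = mse(a, w)

for _ in range(steps):
    x = rng.uniform(-2.0, 2.0)           # one fresh sample per SGD step
    err = predict(x, a, w) - target(x)   # residual of the squared loss
    s = np.tanh(w * x)
    # gradients of (1/2) * err^2 per particle, in the mean-field scaling
    grad_a = err * s
    grad_w = err * a * (1.0 - s ** 2) * x
    a = a - eta * grad_a
    w = w - eta * grad_w

mse_after = mse(a, w)
print(mse_before, mse_after)
```

Running the sketch, the test error decreases as the empirical measure of particles evolves; the paper's results concern the quantitative behavior of exactly this kind of particle system as N grows, including the Gaussian fluctuations around the deterministic limit.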
Pages: 69