Conservative SPDEs as fluctuating mean field limits of stochastic gradient descent

Times Cited: 0
Authors
Gess, Benjamin [1 ,2 ]
Gvalani, Rishabh S. [3 ]
Konarovskyi, Vitalii [4 ,5 ]
Affiliations
[1] Tech Univ Berlin, Inst Math, Str. des 17. Juni 136, D-10623 Berlin, Germany
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Swiss Fed Inst Technol (ETH Zurich), D-MATH, CH-8092 Zurich, Switzerland
[4] Univ Hamburg, Fak Math Informat & Nat Wissensch, D-20146 Hamburg, Germany
[5] NAS Ukraine, Inst Math, UA-01024 Kyiv, Ukraine
Keywords
Stochastic gradient descent; Machine learning; Overparametrization; Dean-Kawasaki equation; SDE with interaction; Fluctuating mean field limit; Law of large numbers; Central limit theorem
DOI
10.1007/s00440-024-01353-6
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
The convergence of stochastic interacting particle systems in the mean-field limit to solutions of conservative stochastic partial differential equations (SPDEs) is established, with an optimal rate of convergence. As a second main result, a quantitative central limit theorem for such SPDEs is derived, again with an optimal rate of convergence. The results apply, in particular, to the convergence, in the mean-field scaling, of stochastic gradient descent dynamics in overparametrized, shallow neural networks to solutions of SPDEs. It is shown that including fluctuations in the limiting SPDE improves the rate of convergence and retains information about the fluctuations of stochastic gradient descent in the continuum limit.
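
To fix ideas, the following compilable LaTeX sketch records the schematic objects behind the abstract: a shallow network, one stochastic gradient descent step, the empirical measure of the parameters, and a conservative SPDE of Dean-Kawasaki type as its fluctuating mean-field limit. The notation (phi, V, sigma, epsilon) is illustrative and is not taken verbatim from the paper; the displayed SPDE is a schematic of the type of equation involved, not the precise limit equation proved there.

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Schematic mean-field picture; notation is illustrative, not verbatim from the paper.
Shallow network with $N$ neurons and parameters $\theta^{1},\dots,\theta^{N}$:
\[
  f_N(x;\theta) = \frac{1}{N}\sum_{i=1}^{N} \phi(x,\theta^{i}).
\]
One SGD step on the sample $(x_k, y_k)$ with learning rate $\eta$:
\[
  \theta^{i}_{k+1}
  = \theta^{i}_{k}
  + \eta\,\bigl(y_k - f_N(x_k;\theta_k)\bigr)\,
    \nabla_{\theta}\phi\bigl(x_k,\theta^{i}_{k}\bigr).
\]
Empirical measure of the parameters on the diffusive time scale:
\[
  \mu^{N}_{t} = \frac{1}{N}\sum_{i=1}^{N}\delta_{\theta^{i}_{\lfloor t/\eta\rfloor}}.
\]
In the joint limit, $\mu^{N}$ is approximated by the solution of a
conservative SPDE of Dean--Kawasaki type, schematically
\[
  \mathrm{d}\mu_t
  = \nabla\!\cdot\!\bigl(\mu_t\,\nabla V(\cdot,\mu_t)\bigr)\,\mathrm{d}t
  + \sqrt{\varepsilon}\,\nabla\!\cdot\!\bigl(\sigma(\cdot,\mu_t)\,\mu_t\,\mathrm{d}W_t\bigr),
\]
where $V$ is the mean-field potential induced by the loss, $W$ is a
cylindrical Wiener process, and $\varepsilon$ is a small parameter tied
to the learning rate. Keeping the noise term, rather than dropping it to
obtain the deterministic mean-field PDE, is what improves the rate of
convergence and retains the fluctuations of SGD in the limit.
\end{document}
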
Pages: 69