Conservative SPDEs as fluctuating mean field limits of stochastic gradient descent

Times Cited: 0
Authors
Gess, Benjamin [1 ,2 ]
Gvalani, Rishabh S. [3 ]
Konarovskyi, Vitalii [4 ,5 ]
Affiliations
[1] Tech Univ Berlin, Inst Math, Str. des 17. Juni 136, D-10623 Berlin, Germany
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Swiss Fed Inst Technol (ETH Zurich), D-MATH, CH-8092 Zurich, Switzerland
[4] Univ Hamburg, Fak Math Informat & Nat Wissensch, D-20146 Hamburg, Germany
[5] NAS Ukraine, Inst Math, UA-01024 Kyiv, Ukraine
Keywords
Stochastic gradient descent; Machine learning; Overparametrization; Dean-Kawasaki equation; SDE with interaction; Fluctuating mean field limit; Law of large numbers; Central limit theorem
DOI
10.1007/s00440-024-01353-6
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
The convergence of stochastic interacting particle systems in the mean-field limit to solutions of conservative stochastic partial differential equations (SPDEs) is established, with an optimal rate of convergence. As a second main result, a quantitative central limit theorem for such SPDEs is derived, again with an optimal rate of convergence. The results apply, in particular, to the convergence, in the mean-field scaling, of stochastic gradient descent dynamics in overparametrized, shallow neural networks to solutions of SPDEs. It is shown that including fluctuations in the limiting SPDE improves the rate of convergence and retains information about the fluctuations of stochastic gradient descent in the continuum limit.
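
To fix ideas, the following compilable LaTeX sketch records the schematic objects behind the abstract: a shallow network, one stochastic gradient descent step, the empirical measure of the parameters, and a conservative SPDE of Dean-Kawasaki type as its fluctuating mean-field limit. The notation (phi, V, sigma, epsilon) is illustrative and is not taken verbatim from the paper; the displayed SPDE is a schematic of the type of equation involved, not the precise limit equation proved there.

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Schematic mean-field picture; notation is illustrative, not verbatim from the paper.
Shallow network with $N$ neurons and parameters $\theta^{1},\dots,\theta^{N}$:
\[
  f_N(x;\theta) = \frac{1}{N}\sum_{i=1}^{N} \phi(x,\theta^{i}).
\]
One SGD step on the sample $(x_k, y_k)$ with learning rate $\eta$:
\[
  \theta^{i}_{k+1}
  = \theta^{i}_{k}
  + \eta\,\bigl(y_k - f_N(x_k;\theta_k)\bigr)\,
    \nabla_{\theta}\phi\bigl(x_k,\theta^{i}_{k}\bigr).
\]
Empirical measure of the parameters on the diffusive time scale:
\[
  \mu^{N}_{t} = \frac{1}{N}\sum_{i=1}^{N}\delta_{\theta^{i}_{\lfloor t/\eta\rfloor}}.
\]
In the joint limit, $\mu^{N}$ is approximated by the solution of a
conservative SPDE of Dean--Kawasaki type, schematically
\[
  \mathrm{d}\mu_t
  = \nabla\!\cdot\!\bigl(\mu_t\,\nabla V(\cdot,\mu_t)\bigr)\,\mathrm{d}t
  + \sqrt{\varepsilon}\,\nabla\!\cdot\!\bigl(\sigma(\cdot,\mu_t)\,\mu_t\,\mathrm{d}W_t\bigr),
\]
where $V$ is the mean-field potential induced by the loss, $W$ is a
cylindrical Wiener process, and $\varepsilon$ is a small parameter tied
to the learning rate. Keeping the noise term, rather than dropping it to
obtain the deterministic mean-field PDE, is what improves the rate of
convergence and retains the fluctuations of SGD in the limit.
\end{document}
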
Pages: 69