Conservative SPDEs as fluctuating mean field limits of stochastic gradient descent

Citations: 0
Authors
Gess, Benjamin [1 ,2 ]
Gvalani, Rishabh S. [3 ]
Konarovskyi, Vitalii [4 ,5 ]
Affiliations
[1] Tech Univ Berlin, Inst Math, Str. des 17. Juni 136, D-10623 Berlin, Germany
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Swiss Fed Inst Technol, D-MATH, CH-8092 Zurich, Switzerland
[4] Univ Hamburg, Fak Math Informat & Nat Wissensch, D-20146 Hamburg, Germany
[5] NAS Ukraine, Inst Math, UA-01024 Kyiv, Ukraine
Keywords
Stochastic gradient descent; Machine learning; Overparametrization; Dean-Kawasaki equation; SDE with interaction; Fluctuation mean field limit; Law of large numbers; Central limit theorem; PARTIAL-DIFFERENTIAL-EQUATIONS; NEURAL-NETWORKS; SYSTEM; DEVIATIONS; PARTICLES; MODEL;
DOI
10.1007/s00440-024-01353-6
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline codes
020208 ; 070103 ; 0714 ;
Abstract
The convergence of stochastic interacting particle systems in the mean-field limit to solutions of conservative stochastic partial differential equations (SPDEs) is established, with an optimal rate of convergence. As a second main result, a quantitative central limit theorem for such SPDEs is derived, again with an optimal rate of convergence. The results apply, in particular, to the convergence, in the mean-field scaling, of stochastic gradient descent dynamics in overparametrized shallow neural networks to solutions of SPDEs. It is shown that including fluctuations in the limiting SPDE improves the rate of convergence and retains information about the fluctuations of stochastic gradient descent in the continuum limit.
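The setting described in the abstract can be illustrated with a minimal numerical sketch (an assumption of this note, not the paper's construction): online SGD for an overparametrized shallow network f(x) = (1/N) Σᵢ aᵢ tanh(wᵢ x). Under the mean-field scaling, the N neuron parameters (aᵢ, wᵢ) behave like interacting particles, and it is their empirical measure whose large-N limit and fluctuations the paper's conservative SPDE captures. All names, the target function, and the parameter choices below are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: online SGD for a shallow mean-field network
#   f(x) = (1/N) * sum_i a_i * tanh(w_i * x),
# fitting a toy target. The particles (a_i, w_i) receive O(1) updates because
# the 1/N network normalization is absorbed by the mean-field learning rate
# eta * N; the empirical measure of the particles is the object whose
# mean-field limit the abstract describes.

rng = np.random.default_rng(0)

N, eta, steps = 200, 0.05, 1000

def target(x):
    return np.sin(x)  # toy regression target (assumption)

# particle initialization: i.i.d. samples from the initial measure
w = rng.normal(size=N)
a = rng.normal(size=N)

def predict(x, a, w):
    return np.mean(a * np.tanh(w * x))  # mean-field 1/N normalization

def mse(a, w):
    # test error on a fixed grid
    xs = np.linspace(-2.0, 2.0, 101)
    return float(np.mean([(predict(x, a, w) - target(x)) ** 2 for x in xs]))

mse_before = mse(a, w)

for _ in range(steps):
    x = rng.uniform(-2.0, 2.0)           # one fresh sample per SGD step
    err = predict(x, a, w) - target(x)   # residual of the squared loss
    s = np.tanh(w * x)
    # gradients of (1/2) * err^2 per particle, in the mean-field scaling
    grad_a = err * s
    grad_w = err * a * (1.0 - s ** 2) * x
    a = a - eta * grad_a
    w = w - eta * grad_w

mse_after = mse(a, w)
print(mse_before, mse_after)
```

Running the sketch, the test error decreases as the empirical measure of particles evolves; the paper's results concern the quantitative behavior of exactly this kind of particle system as N grows, including the Gaussian fluctuations around the deterministic limit.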
Pages: 69