Keywords:
Stochastic gradient descent;
Online learning;
Functional central limit theorem;
Mixing;
Markov chains in random environments;
Dependent data streams;
Approximation;
DOI:
10.1007/s00245-023-10052-y
CLC number:
O29 [Applied Mathematics];
Subject classification code:
070104;
Abstract:
We study the mixing properties of an important optimization algorithm of machine learning: the stochastic gradient Langevin dynamics (SGLD) with a fixed step size. The data stream is not assumed to be independent, hence SGLD is not a Markov chain, merely a Markov chain in a random environment, which complicates the mathematical treatment considerably. We derive a strong law of large numbers and a functional central limit theorem for SGLD.
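To make the setting concrete, here is a minimal Python sketch of fixed step-size SGLD driven by a dependent (AR(1)) data stream. All function and parameter names are illustrative, and the recursion theta_{k+1} = theta_k - lambda * grad(theta_k, X_{k+1}) + sqrt(2*lambda/beta) * xi_{k+1} is the standard SGLD update with fixed step size lambda and inverse temperature beta, not necessarily the exact formulation used in the paper:

import numpy as np

def sgld(grad, data_stream, theta0, step=1e-3, beta=1e2, n_iter=10_000, rng=None):
    # Fixed step-size SGLD: theta <- theta - step * grad(theta, x)
    #                               + sqrt(2 * step / beta) * Gaussian noise.
    # Because the X_k drawn from `data_stream` need not be i.i.d., the iterates
    # (theta_k) alone do not form a Markov chain; they are a Markov chain in
    # the random environment generated by the data.
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    noise_scale = np.sqrt(2.0 * step / beta)
    path = np.empty((n_iter + 1,) + theta.shape)
    path[0] = theta
    for k in range(n_iter):
        x = next(data_stream)  # one sample from the (possibly dependent) stream
        theta = theta - step * grad(theta, x) + noise_scale * rng.standard_normal(theta.shape)
        path[k + 1] = theta
    return path

def ar1_stream(phi=0.8, rng=None):
    # AR(1) process: a simple example of a mixing, dependent data stream.
    rng = np.random.default_rng() if rng is None else rng
    x = 0.0
    while True:
        x = phi * x + rng.standard_normal()
        yield x

# Scalar quadratic loss f(theta, x) = (theta - x)^2 / 2, so grad = theta - x.
path = sgld(grad=lambda th, x: th - x, data_stream=ar1_stream(), theta0=0.0)
print(path.mean())  # ergodic average of the iterates

The time average printed at the end is the quantity governed by the strong law of large numbers; its fluctuations around the limit are what the functional central limit theorem describes.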
Affiliations:
Guigues, Vincent: Fundacao Getulio Vargas, Sch Appl Math, 190 Praia Botafogo, Rio De Janeiro, Brazil
Kraetschmer, Volker: Univ Duisburg Essen, Fac Math, Essen, Germany
Shapiro, Alexander: Georgia Inst Technol, Sch Ind & Syst Engn, Atlanta, GA 30332 USA