44 references in total
[1]
Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Stat. 22:400-407
[2]
Ding F, Yang HZ, Liu F (2008) Performance analysis of stochastic gradient algorithms under weak conditions. Sci. China Ser. F: Inf. Sci. 51:1269-1280
[3]
Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60:223-311
[4]
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436-444
[5]
Schmidt M, Le Roux N, Bach F (2017) Minimizing finite sums with the stochastic average gradient. Math. Program. 162:83-112
[6]
Nguyen LM, Scheinberg K, Takáč M (2020) Inexact SARAH algorithm for stochastic optimization. Optim. Method. Softw. 36:237-258
[7]
Bottou L (1998) Online learning and stochastic approximations. Online Learn. Neural Netw. 17:9-42
[8]
Duchi J, Hazan E, Singer Y (2011) Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12:2121-2159
[9]
Barzilai J, Borwein JM (1988) Two-point step size gradient methods. IMA J. Numer. Anal. 8:141-148
[10]
Dai YH, Liao LZ (2002) R-linear convergence of the Barzilai and Borwein gradient method. IMA J. Numer. Anal. 22:1-10