- [1] Allen-Zhu Z (2018) Katyusha: The first direct acceleration of stochastic gradient methods. J. Mach. Learn. Res. 18 1-51
- [2] Bottou L (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60 223-311
- [3] Duchi J (2011) Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12 2121-2159
- [4] LeCun Y (1998) Gradient-based learning applied to document recognition. Proc. IEEE 86 2278-2324
- [5] Mandt S (2017) Stochastic gradient descent as approximate Bayesian inference. J. Mach. Learn. Res. 18 1-35
- [6] Nesterov Y (1983) A method for solving the convex programming problem with convergence rate O(1/k^2). Dokl. Akad. Nauk SSSR 269 543-547
- [7] Qian N (1999) On the momentum term in gradient descent learning algorithms. Neural Networks 12 145-151
- [8] Robbins H (1951) A stochastic approximation method. Ann. Math. Stat. 22 400-407
- [9] Shapiro A (1996) Convergence analysis of gradient descent stochastic algorithms. J. Optim. Theory Appl. 91 439-454
- [10] Silver D (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529 484-489