[1]
Bertsekas DP, Tsitsiklis JN (2000) Gradient convergence in gradient methods with errors. SIAM J. Optim. 10(3):627-642
[2]
Cheridito P, Jentzen A, Riekert A, Rossmannek F (2022) A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions. J. Complex. 72:101646
[3]
Cheridito P, Jentzen A, Rossmannek F (2021) Non-convergence of stochastic gradient descent in the training of deep neural networks. J. Complex. 64:101540
[4]
Fehrman B, Gess B, Jentzen A (2020) Convergence rates for the stochastic gradient descent method for non-convex objective functions. J. Mach. Learn. Res. 21(136):1-48
[5]
Jentzen A, von Wurstemberger P (2020) Lower error bounds for the stochastic gradient descent optimization algorithm: sharp convergence rates for slowly and fast decaying learning rates. J. Complex. 57:101438
[6]
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436-444
[7]
Lee JD, Panageas I, Piliouras G, Simchowitz M, Jordan MI, Recht B (2019) First-order methods almost always avoid strict saddle points. Math. Program. 176(1-2):311-337
[8]
Lei Y, Hu T, Li G, Tang K (2020) Stochastic gradient descent for nonconvex learning without bounded gradient assumptions. IEEE Trans. Neural Netw. Learn. Syst. 31(10):4394-4400
[9]
Lu L, Shin Y, Su Y, Karniadakis GE (2020) Dying ReLU and initialization: theory and numerical examples. Commun. Comput. Phys. 28(5):1671-1706
[10]
Nesterov YE (1983) A method for solving the convex programming problem with convergence rate O(1/k²). Proc. USSR Acad. Sci. 269:543-547