Stochastic Cubic Regularization for Fast Nonconvex Optimization

Cited by: 0
Authors
Tripuraneni, Nilesh [1 ]
Stern, Mitchell [1 ]
Jin, Chi [1 ]
Regier, Jeffrey [1 ]
Jordan, Michael I. [1 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Vol. 31
Keywords
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a stochastic variant of a classic algorithm, the cubic-regularized Newton method [Nesterov and Polyak, 2006]. The proposed algorithm efficiently escapes saddle points and finds approximate local minima for general smooth, nonconvex functions using only $\tilde{O}(\epsilon^{-3.5})$ stochastic gradient and stochastic Hessian-vector product evaluations; the latter can be computed as efficiently as stochastic gradients. This improves upon the $\tilde{O}(\epsilon^{-4})$ rate of stochastic gradient descent. Our rate matches the best-known result for finding local minima without requiring any delicate acceleration or variance-reduction techniques.
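To make the abstract's description concrete, here is a minimal sketch of a stochastic cubic-regularized Newton loop, not the authors' reference implementation: each iteration builds a local cubic model from a stochastic gradient and stochastic Hessian-vector products, then approximately minimizes that model by plain gradient descent. The toy objective, the Gaussian-noise gradient oracle, the finite-difference Hessian-vector product, and all step sizes are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy nonconvex objective with a saddle point at the origin:
#   f(w) = 0.25 * ||w||^4 - 0.5 * ||w||^2,  minimized on the unit sphere.
def grad_f(w):
    return (np.dot(w, w) - 1.0) * w

def stoch_grad(w, sigma=0.01):
    # Stand-in for a minibatch gradient: exact gradient plus Gaussian noise.
    return grad_f(w) + sigma * rng.standard_normal(w.shape)

def hvp(w, v, h=1e-5):
    # Hessian-vector product from a finite difference of two gradient calls,
    # so it costs roughly one extra gradient evaluation. In the stochastic
    # setting, both evaluations would share the same minibatch.
    return (grad_f(w + h * v) - grad_f(w)) / h

def cubic_subsolver(g, hvp_fn, rho, eta=0.05, n_iter=200):
    # Approximately minimize the cubic model
    #   m(s) = g's + 0.5 s'Hs + (rho/6) ||s||^3
    # by gradient descent; its gradient is g + Hs + (rho/2) ||s|| s.
    s = np.zeros_like(g)
    for _ in range(n_iter):
        s -= eta * (g + hvp_fn(s) + 0.5 * rho * np.linalg.norm(s) * s)
    return s

def stochastic_cubic_newton(w, rho=1.0, n_outer=30):
    # Outer loop: draw fresh stochastic estimates, minimize the local
    # cubic model, and take the resulting step.
    for _ in range(n_outer):
        g = stoch_grad(w)
        s = cubic_subsolver(g, lambda v: hvp(w, v), rho)
        w = w + s
    return w

w = stochastic_cubic_newton(np.full(5, 1e-3))  # start next to the saddle
print(np.linalg.norm(w))                       # ~1.0: escaped to the minima
```

The finite-difference `hvp` illustrates the abstract's remark that stochastic Hessian-vector products can be computed as efficiently as stochastic gradients: the cubic subsolver never forms a Hessian, only products against it.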
Pages: 10