Combining Stochastic Adaptive Cubic Regularization with Negative Curvature for Nonconvex Optimization

Cited by: 15
Authors
Park, Seonho [1]
Jung, Seung Hyun [2]
Pardalos, Panos M. [1]
Affiliations
[1] Univ Florida, Ctr Appl Optimizat, Gainesville, FL 32611 USA
[2] Korea Inst Ind Technol KITECH, Daegu, South Korea
Funding
National Research Foundation, Singapore
Keywords
Adaptive cubic-regularized Newton method; Cubic regularization; Trust-region method; Negative curvature; Nonconvex optimization; Worst-case complexity; Quasi-Newton method
DOI
10.1007/s10957-019-01624-6
Chinese Library Classification
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
We focus on minimizing nonconvex finite-sum functions, which typically arise in machine learning problems. For this problem, the adaptive cubic-regularized Newton method has shown strong global convergence guarantees and the ability to escape strict saddle points. In this paper, we extend this algorithm by incorporating the negative curvature method, so that the iterate is updated even at unsuccessful iterations. We call the new method Stochastic Adaptive cubic regularization with Negative Curvature (SANC). Unlike the previous method, the SANC algorithm obtains its stochastic gradient and Hessian estimators from independent sets of data points whose size is consistent over all iterations. This makes the SANC algorithm more practical for solving large-scale machine learning problems. To the best of our knowledge, this is the first approach that combines the negative curvature method with the adaptive cubic-regularized Newton method. Finally, we provide experimental results, including neural network problems, that support the efficiency of our method.
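The abstract gives no algorithmic detail, so the following is only a toy sketch of the general idea it describes, not the authors' SANC algorithm. It assumes a small nonconvex finite-sum objective, a Cauchy-point (steepest-descent) solution of the cubic model, a full eigendecomposition to find negative curvature, and a simple decrease check on the negative curvature step. All function names, parameter values, and the objective itself are hypothetical choices made for illustration.

```python
import numpy as np

# Toy finite-sum objective (an assumption, not from the paper):
# f(x) = mean_i [ 0.25*||x||^4 - 0.5*(a_i . x)^2 ], with a saddle-like point at x = 0.
def loss(x, A):
    return 0.25 * np.dot(x, x) ** 2 - 0.5 * np.mean((A @ x) ** 2)

def grad(x, A):
    return np.dot(x, x) * x - A.T @ (A @ x) / len(A)

def hess(x, A):
    d = x.size
    return np.dot(x, x) * np.eye(d) + 2.0 * np.outer(x, x) - A.T @ A / len(A)

def cubic_model(s, g, H, sigma):
    # m(s) = g.s + 0.5 s.H.s + (sigma/3) ||s||^3
    return g @ s + 0.5 * s @ H @ s + sigma / 3.0 * np.linalg.norm(s) ** 3

def cauchy_step(g, H, sigma):
    # Minimizer of the cubic model along -g (closed form); zero step if g vanishes.
    gnorm = np.linalg.norm(g)
    if gnorm < 1e-12:
        return np.zeros_like(g)
    a, b, c = sigma * gnorm ** 3, g @ H @ g, gnorm ** 2
    disc = np.sqrt(b * b + 4.0 * a * c)
    # Positive root of a*alpha^2 + b*alpha - c = 0, in a cancellation-free form.
    alpha = 2.0 * c / (b + disc) if b >= 0 else (disc - b) / (2.0 * a)
    return -alpha * g

def sanc_sketch(A, x0, batch_size, iters=100, sigma=1.0,
                eta=0.1, gamma=2.0, sigma_min=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    n = len(A)
    for _ in range(iters):
        # Independent, fixed-size samples for the gradient and Hessian estimators.
        Ag = A[rng.choice(n, size=batch_size, replace=False)]
        Ah = A[rng.choice(n, size=batch_size, replace=False)]
        g, H = grad(x, Ag), hess(x, Ah)
        s = cauchy_step(g, H, sigma)
        pred = -cubic_model(s, g, H, sigma)  # predicted decrease
        rho = (loss(x, Ag) - loss(x + s, Ag)) / pred if pred > 0 else -np.inf
        if rho >= eta:
            # Successful iteration: accept the step and relax the regularization.
            x = x + s
            sigma = max(sigma / gamma, sigma_min)
            continue
        # Unsuccessful iteration: try a negative curvature step instead of stalling.
        lam, V = np.linalg.eigh(H)  # eigenvalues in ascending order
        if lam[0] < 0:
            v = V[:, 0] if g @ V[:, 0] <= 0 else -V[:, 0]
            d = (-lam[0] / sigma) * v  # minimizes 0.5*lam*t^2 + (sigma/3)*t^3 over t > 0
            if loss(x + d, Ag) < loss(x, Ag):  # safeguard added for this sketch
                x = x + d
        sigma *= gamma
    return x

# Hypothetical usage: start exactly at the stationary point x = 0.
data = np.random.default_rng(1).standard_normal((200, 5))
x_final = sanc_sketch(data, np.zeros(5), batch_size=32)
print(loss(np.zeros(5), data), loss(x_final, data))
```

Starting at the origin, the sampled gradient is exactly zero, so the cubic step is zero and the iteration counts as unsuccessful; the negative curvature direction is then the only way the iterate can move. A practical implementation would replace the eigendecomposition with an approximate leftmost-eigenvector routine and use a better subproblem solver than the Cauchy point.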
Pages: 953-971
Number of pages: 19