A fully stochastic second-order trust region method

被引:7
|
作者
Curtis, Frank E. [1 ]
Shi, Rui [1 ]
机构
[1] Lehigh Univ, Dept Ind & Syst Engn, Bethlehem, PA 18015 USA
来源
OPTIMIZATION METHODS & SOFTWARE | 2022年 / 37卷 / 03期
基金
美国国家科学基金会;
关键词
Stochastic optimization; finite-sum optimization; stochastic Newton methods; trust region methods; machine learning; deep neural networks; time series forecasting; QUASI-NEWTON METHOD; OPTIMIZATION METHODS;
D O I
10.1080/10556788.2020.1852403
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
A stochastic second-order trust region method is proposed, which can be viewed as an extension of the trust-region-ish (TRish) algorithm proposed by Curtis et al. [A stochastic trust region algorithm based on careful step normalization. INFORMS J. Optim. 1(3) 200-220, 2019]. In each iteration, a search direction is computed by (approximately) solving a subproblem defined by stochastic gradient and Hessian estimates. The algorithm has convergence guarantees in the fully stochastic regime, i.e. when each stochastic gradient is merely an unbiased estimate of the gradient with bounded variance and the stochastic Hessian estimates are bounded. This framework covers a variety of implementations, such as when the stochastic Hessians are defined by sampled second-order derivatives or diagonal matrices, such as in RMSprop, Adagrad, Adam and other popular algorithms. The proposed algorithm has a worst-case complexity guarantee in the nearly deterministic regime, i.e. when the stochastic gradients and Hessians are close in expectation to the true gradients and Hessians. The results of numerical experiments for training CNNs for image classification and an RNN for time series forecasting are presented. These results show that the algorithm can outperform a stochastic gradient and first-order TRish algorithm.
引用
收藏
页码:844 / 877
页数:34
相关论文
共 50 条
  • [41] A second-order stochastic resonance method enhanced by fractional-order derivative for mechanical fault detection
    Qiao, Zijian
    Elhattab, Ahmed
    Shu, Xuedao
    He, Changbo
    NONLINEAR DYNAMICS, 2021, 106 (01) : 707 - 723
  • [42] INTRUSION DETECTION BASED ON THE SECOND-ORDER STOCHASTIC MODEL
    Zhang Xiaoqiang Zhu Zhongliang Fan Pingzhi (School of Logistics
    JournalofElectronics(China), 2007, (05) : 679 - 685
  • [43] A second-order stochastic filter involving coordinate transformation
    Nam, K
    Tahk, MJ
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1999, 44 (03) : 603 - 608
  • [44] LINEARIZATION OF A SECOND-ORDER STOCHASTIC ORDINARY DIFFERENTIAL EQUATION
    Meleshko, Sergey V.
    Schulz, Eckart
    JOURNAL OF NONLINEAR MATHEMATICAL PHYSICS, 2011, 18 (03) : 427 - 441
  • [45] Between First- and Second-Order Stochastic Dominance
    Mueller, Alfred
    Scarsini, Marco
    Tsetlin, Ilia
    Winkler, Robert L.
    MANAGEMENT SCIENCE, 2017, 63 (09) : 2933 - 2947
  • [46] Numerical methods for second-order stochastic differential equations
    Burrage, Kevin
    Lenane, Ian
    Lythe, Grant
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2007, 29 (01): : 245 - 264
  • [47] Fast Second-Order Stochastic Backpropagation for Variational Inference
    Fan, Kai
    Wang, Ziteng
    Beck, Jeffrey
    Kwok, James T.
    Heller, Katherine
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [48] A second-order stochastic dominance portfolio efficiency measure
    Kopa, Milos
    Chovanec, Petr
    KYBERNETIKA, 2008, 44 (02) : 243 - 258
  • [49] Stochastic flows for nonlinear second-order parabolic SPDE
    Flandoli, F
    ANNALS OF PROBABILITY, 1996, 24 (02): : 547 - 558
  • [50] Testing first- and second-order stochastic dominance
    Xu, K
    Fisher, G
    Willson, D
    CANADIAN JOURNAL OF ECONOMICS-REVUE CANADIENNE D ECONOMIQUE, 1996, 29 : S562 - S564