Stochastic Optimization for Nonconvex Problem With Inexact Hessian Matrix, Gradient, and Function

Cited by: 2
Authors
Liu, Liu [1 ,2 ]
Liu, Xuanqing [3 ]
Hsieh, Cho-Jui [4 ]
Tao, Dacheng [5 ]
Affiliations
[1] Beihang Univ, Inst Artificial Intelligence, Beijing 100191, Peoples R China
[2] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China
[3] Amazon Web Serv, Seattle, WA 98108 USA
[4] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
[5] Univ Sydney, Sch Comp Sci, Sydney AI Ctr, Fac Engn, Sydney, NSW 2008, Australia
Keywords
Optimization; Convergence; Complexity theory; Stochastic processes; Linear matrix inequalities; Approximation algorithms; Training; Adaptive regularization; stochastic optimization; trust region (TR); ALGORITHMS; REGULARIZATION; COMPLEXITY
DOI
10.1109/TNNLS.2023.3326177
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Trust region (TR) and adaptive regularization using cubics (ARC) have proven to have very appealing theoretical properties for nonconvex optimization, as they concurrently compute the function value, gradient, and Hessian matrix to obtain the next search direction and adjust the parameters. Although stochastic approximations greatly reduce the computational cost, it is challenging to theoretically guarantee the convergence rate. In this article, we explore a family of stochastic TR (STR) and stochastic ARC (SARC) methods that simultaneously allow inexact computation of the Hessian matrix, gradient, and function value. Our algorithms require much less propagation overhead per iteration than TR and ARC. We prove that the iteration complexity to achieve epsilon-approximate second-order optimality is of the same order as that of the exact computations established in previous studies. In addition, the mild conditions on inexactness can be met by leveraging random sampling in the finite-sum minimization problem. Numerical experiments on a nonconvex problem support these findings and demonstrate that, with the same or a similar number of iterations, our algorithms require less computational overhead per iteration than current second-order methods.
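To make the inexactness concrete, the sketch below shows how the three estimates can be formed by independent random subsampling in a finite-sum problem f(x) = (1/n) Σᵢ fᵢ(x), and how they feed a trust-region step. Everything here is an illustrative assumption rather than the authors' STR/SARC algorithm: the per-sample oracles `loss_i`, `grad_i`, and `hess_i`, the batch sizes, and the eigendecomposition-based subproblem solver are hypothetical choices suited to a small dense problem.

```python
import numpy as np

# Sketch only: subsampled estimates of f, grad f, and the Hessian for a
# finite-sum objective f(x) = (1/n) * sum_i f_i(x). The oracles loss_i,
# grad_i, hess_i and the batch sizes are illustrative assumptions.
def subsampled_estimates(x, loss_i, grad_i, hess_i, n, rng,
                         bf=64, bg=64, bh=32):
    Sf = rng.integers(0, n, size=bf)  # independent batch for the function
    Sg = rng.integers(0, n, size=bg)  # independent batch for the gradient
    Sh = rng.integers(0, n, size=bh)  # independent batch for the Hessian
    f = np.mean([loss_i(x, i) for i in Sf])
    g = np.mean([grad_i(x, i) for i in Sg], axis=0)
    H = np.mean([hess_i(x, i) for i in Sh], axis=0)
    return f, g, H

def tr_step(g, H, delta):
    """Minimize the quadratic model g's + 0.5 s'Hs over ||s|| <= delta
    by bisection on the multiplier mu in s(mu) = -(H + mu*I)^{-1} g.
    Suitable only for small dense Hessian estimates; the degenerate
    'hard case' of the trust-region subproblem is ignored."""
    lam, Q = np.linalg.eigh(H)            # H is symmetric by construction
    gq = Q.T @ g
    lo = max(0.0, -lam.min()) + 1e-12     # smallest admissible multiplier
    s = -(Q @ (gq / (lam + lo)))
    if np.linalg.norm(s) <= delta:        # unconstrained step already fits
        return s
    hi = lo + 1.0                         # grow hi until the step fits
    while np.linalg.norm(Q @ (gq / (lam + hi))) > delta:
        hi *= 2.0
    for _ in range(100):                  # bisect: ||s(mu)|| decreases in mu
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(Q @ (gq / (lam + mid))) > delta:
            lo = mid
        else:
            hi = mid
    return -(Q @ (gq / (lam + hi)))
```

A full stochastic TR method would wrap `tr_step` in the usual ratio test, comparing the (inexact) decrease f(x) − f(x + s) against the model decrease to accept or reject the step and to grow or shrink delta; ARC-style variants replace the hard constraint ||s|| ≤ delta with a cubic penalty (σ/3)||s||³. The paper's contribution is the analysis showing that, under mild accuracy conditions on all three subsampled estimates, such schemes retain the iteration complexity of their exact counterparts for reaching approximate second-order stationary points (small gradient norm and nearly positive-semidefinite Hessian).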
Pages: 1651-1663
Page count: 13