Stochastic Optimization for Nonconvex Problem With Inexact Hessian Matrix, Gradient, and Function

Cited: 2
Authors
Liu, Liu [1 ,2 ]
Liu, Xuanqing [3 ]
Hsieh, Cho-Jui [4 ]
Tao, Dacheng [5 ]
Affiliations
[1] Beihang Univ, Inst Artificial Intelligence, Beijing 100191, Peoples R China
[2] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China
[3] Amazon Web Serv, Seattle, WA 98108 USA
[4] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
[5] Univ Sydney, Sch Comp Sci, Sydney AI Ctr, Fac Engn, Sydney, NSW 2008, Australia
Keywords
Optimization; Convergence; Complexity theory; Stochastic processes; Linear matrix inequalities; Approximation algorithms; Training; Adaptive regularization; stochastic optimization; trust region (TR); ALGORITHMS; REGULARIZATION; COMPLEXITY
DOI
10.1109/TNNLS.2023.3326177
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Trust region (TR) and adaptive regularization using cubics (ARC) have proven to possess appealing theoretical properties for nonconvex optimization by concurrently computing the function value, gradient, and Hessian matrix to obtain the next search direction and the adjusted parameters. Although stochastic approximations greatly reduce the computational cost, it is challenging to theoretically guarantee the convergence rate. In this article, we explore a family of stochastic TR (STR) and stochastic ARC (SARC) methods that simultaneously allow inexact computations of the Hessian matrix, gradient, and function values. Our algorithms require far less propagation overhead per iteration than TR and ARC. We prove that the iteration complexity to achieve epsilon-approximate second-order optimality is of the same order as with the exact computations demonstrated in previous studies. In addition, the mild conditions on inexactness can be met by leveraging a random sampling technique in the finite-sum minimization problem. Numerical experiments on a nonconvex problem support these findings and demonstrate that, with the same or a similar number of iterations, our algorithms require less computational overhead per iteration than current second-order methods.
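The abstract describes TR steps built from subsampled (hence inexact) function, gradient, and Hessian estimates over a finite sum. The following is a minimal illustrative sketch of that idea, not the authors' STR/SARC algorithm: it subsamples all three quantities each iteration and takes a Cauchy-point trust-region step with a standard acceptance-ratio test. All function names, batch sizes, and constants are illustrative assumptions.

```python
import numpy as np

def subsampled_oracle(f_list, g_list, h_list, x, batch, rng):
    # Inexact f, gradient, and Hessian via uniform subsampling of the
    # finite sum f(x) = (1/n) * sum_i f_i(x).
    idx = rng.choice(len(f_list), size=batch, replace=False)
    f = np.mean([f_list[i](x) for i in idx])
    g = np.mean([g_list[i](x) for i in idx], axis=0)
    H = np.mean([h_list[i](x) for i in idx], axis=0)
    return f, g, H

def cauchy_step(g, H, delta):
    # Minimize the quadratic model m(s) = g.s + 0.5 s'Hs along -g,
    # subject to the trust-region constraint ||s|| <= delta.
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)
    gHg = g @ H @ g
    t = delta / gnorm
    if gHg > 0:
        t = min(t, gnorm ** 2 / gHg)  # unconstrained minimizer along -g
    return -t * g

def stochastic_tr(f_list, g_list, h_list, x0, batch=8, delta=1.0,
                  iters=50, eta=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(iters):
        f, g, H = subsampled_oracle(f_list, g_list, h_list, x, batch, rng)
        s = cauchy_step(g, H, delta)
        pred = -(g @ s + 0.5 * s @ H @ s)  # model-predicted decrease
        if pred <= 0:
            delta *= 0.5
            continue
        # Note: f at x + s is also estimated from a fresh random batch,
        # so the ratio test itself uses inexact function values.
        f_new, _, _ = subsampled_oracle(f_list, g_list, h_list, x + s,
                                        batch, rng)
        rho = (f - f_new) / pred  # actual vs. predicted decrease
        if rho >= eta:
            x = x + s
            delta = min(2.0 * delta, 10.0)  # expand trust region
        else:
            delta *= 0.5                    # shrink trust region
    return x
```

The ratio test guards against steps justified only by a bad subsampled model: when the realized decrease falls short of the prediction, the step is rejected and the trust region shrinks, which mirrors how TR methods tolerate inexact oracles.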
Pages: 1651-1663
Page count: 13
Related Papers
(50 total)
  • [21] Gradient-Free Methods for Deterministic and Stochastic Nonsmooth Nonconvex Optimization
    Lin, Tianyi
    Zheng, Zeyu
    Jordan, Michael I.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [22] Stochastic Intermediate Gradient Method for Convex Problems with Stochastic Inexact Oracle
    Dvurechensky, Pavel
    Gasnikov, Alexander
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2016, 171 (01) : 121 - 145
  • [23] A proximal trust-region method for nonsmooth optimization with inexact function and gradient evaluations
    Baraldi, Robert J.
    Kouri, Drew P.
    MATHEMATICAL PROGRAMMING, 2023, 201 (1-2) : 559 - 598
  • [24] A Stochastic Quasi-Newton Method for Large-Scale Nonconvex Optimization With Applications
    Chen, Huiming
    Wu, Ho-Chun
    Chan, Shing-Chow
    Lam, Wong-Hing
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (11) : 4776 - 4790
  • [25] Accelerated gradient methods for nonconvex nonlinear and stochastic programming
    Ghadimi, Saeed
    Lan, Guanghui
    MATHEMATICAL PROGRAMMING, 2016, 156 (1-2) : 59 - 99
  • [26] Accelerated Distributed Stochastic Nonconvex Optimization Over Time-Varying Directed Networks
    Chen, Yiyue
    Hashemi, Abolfazl
    Vikalo, Haris
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2025, 70 (04) : 2196 - 2211
  • [27] ε-Approximation of Adaptive Learning Rate Optimization Algorithms for Constrained Nonconvex Stochastic Optimization
    Iiduka, Hideaki
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (10) : 8108 - 8115
  • [28] ADAPTIVE REGULARIZATION ALGORITHMS WITH INEXACT EVALUATIONS FOR NONCONVEX OPTIMIZATION
    Bellavia, Stefania
    Gurioli, Gianmarco
    Morini, Benedetta
    Toint, Philippe L.
    SIAM JOURNAL ON OPTIMIZATION, 2019, 29 (04) : 2881 - 2915
  • [29] An inexact Newton method for nonconvex equality constrained optimization
    Byrd, Richard H.
    Curtis, Frank E.
    Nocedal, Jorge
    MATHEMATICAL PROGRAMMING, 2010, 122 (02) : 273 - 299
  • [30] GNSD: A GRADIENT-TRACKING BASED NONCONVEX STOCHASTIC ALGORITHM FOR DECENTRALIZED OPTIMIZATION
    Lu, Songtao
    Zhang, Xinwei
    Sun, Haoran
    Hong, Mingyi
    2019 IEEE DATA SCIENCE WORKSHOP (DSW), 2019, : 315 - 321