Rates of superlinear convergence for classical quasi-Newton methods

Cited by: 0
Authors
Anton Rodomanov
Yurii Nesterov
Affiliations
[1] Catholic University of Louvain, Institute of Information and Communication Technologies, Electronics and Applied Mathematics
[2] Catholic University of Louvain, Center for Operations Research and Econometrics
Source
Mathematical Programming | 2022, Vol. 194
Keywords
Quasi-Newton methods; Convex Broyden class; DFP; BFGS; Superlinear convergence; Local convergence; Rate of convergence; 90C53; 90C30; 68Q25
DOI
Not available
Abstract
We study the local convergence of classical quasi-Newton methods for nonlinear optimization. Although it was well established a long time ago that asymptotically these methods converge superlinearly, the corresponding rates of convergence have remained unknown. In this paper, we address this problem. We obtain the first explicit non-asymptotic rates of superlinear convergence for the standard quasi-Newton methods, which are based on the updating formulas from the convex Broyden class. In particular, for the well-known DFP and BFGS methods, we obtain rates of the form $\left(\frac{n L^2}{\mu^2 k}\right)^{k/2}$ and $\left(\frac{n L}{\mu k}\right)^{k/2}$, respectively, where $k$ is the iteration counter, $n$ is the dimension of the problem, $\mu$ is the strong convexity parameter, and $L$ is the Lipschitz constant of the gradient.
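
To make the stated rate concrete, below is a minimal numerical sketch (not code from the paper, which is purely theoretical): it runs the classical BFGS method with unit step size on a strongly convex quadratic and prints the distance to the minimizer next to the factor $\left(\frac{n L}{\mu k}\right)^{k/2}$ quoted above. The problem data ($n$, $\mu$, $L$, the random seed) and the initialization $B_0 = L \cdot I$ are arbitrary illustrative choices; in the paper the factor bounds a local measure of the gradient rather than the distance to the minimizer itself, and it becomes nontrivial only once $k$ exceeds $n L / \mu$.

```python
# Illustrative sketch only: classical BFGS on f(x) = 0.5 * x^T A x, whose
# minimizer is x* = 0, compared against the factor (n L / (mu k))^(k/2)
# from the abstract. All problem data below are arbitrary choices, not
# taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
n, mu, L = 3, 1.0, 2.0

# Symmetric positive definite A with spectrum in [mu, L].
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(mu, L, n)) @ Q.T

x = rng.standard_normal(n)
B = L * np.eye(n)                       # common initialization B_0 = L * I
for k in range(1, 21):
    x_next = x - np.linalg.solve(B, A @ x)   # quasi-Newton step, step size 1
    s = x_next - x
    y = A @ s                           # y = grad(x_next) - grad(x) for a quadratic
    Bs = B @ s
    # Classical BFGS update of the Hessian approximation B_k.
    B += np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)
    x = x_next
    bound = (n * L / (mu * k)) ** (k / 2)
    print(f"k={k:2d}  ||x_k - x*|| = {np.linalg.norm(x):.3e}  "
          f"(nL/(mu k))^(k/2) = {bound:.3e}")
    if np.linalg.norm(x) < 1e-14:       # stop once at machine precision
        break
```

Running this, the error ratio between consecutive iterates shrinks with $k$ rather than staying constant, which is the superlinear behavior whose rate the paper quantifies.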
Pages: 159-190
Number of pages: 31