Rates of superlinear convergence for classical quasi-Newton methods

Cited by: 0
Authors
Anton Rodomanov
Yurii Nesterov
Affiliations
[1] Catholic University of Louvain, Institute of Information and Communication Technologies, Electronics and Applied Mathematics
[2] Catholic University of Louvain, Center for Operations Research and Econometrics
Source
Mathematical Programming | 2022, Vol. 194
Keywords
Quasi-Newton methods; Convex Broyden class; DFP; BFGS; Superlinear convergence; Local convergence; Rate of convergence; 90C53; 90C30; 68Q25
DOI
Not available
Abstract
We study the local convergence of classical quasi-Newton methods for nonlinear optimization. Although it was well established a long time ago that asymptotically these methods converge superlinearly, the corresponding rates of convergence have remained unknown. In this paper, we address this problem. We obtain the first explicit non-asymptotic rates of superlinear convergence for the standard quasi-Newton methods, which are based on the updating formulas from the convex Broyden class. In particular, for the well-known DFP and BFGS methods, we obtain rates of the form $\left(\frac{n L^2}{\mu^2 k}\right)^{k/2}$ and $\left(\frac{n L}{\mu k}\right)^{k/2}$, respectively, where $k$ is the iteration counter, $n$ is the dimension of the problem, $\mu$ is the strong convexity parameter, and $L$ is the Lipschitz constant of the gradient.
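
To make the stated rate concrete, below is a minimal numerical sketch (not code from the paper, which is purely theoretical): it runs the classical BFGS method with unit step size on a strongly convex quadratic and prints the distance to the minimizer next to the factor $\left(\frac{n L}{\mu k}\right)^{k/2}$ quoted above. The problem data ($n$, $\mu$, $L$, the random seed) and the initialization $B_0 = L \cdot I$ are arbitrary illustrative choices; in the paper the factor bounds a local measure of the gradient rather than the distance to the minimizer itself, and it becomes nontrivial only once $k$ exceeds $n L / \mu$.

```python
# Illustrative sketch only: classical BFGS on f(x) = 0.5 * x^T A x, whose
# minimizer is x* = 0, compared against the factor (n L / (mu k))^(k/2)
# from the abstract. All problem data below are arbitrary choices, not
# taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
n, mu, L = 3, 1.0, 2.0

# Symmetric positive definite A with spectrum in [mu, L].
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(mu, L, n)) @ Q.T

x = rng.standard_normal(n)
B = L * np.eye(n)                       # common initialization B_0 = L * I
for k in range(1, 21):
    x_next = x - np.linalg.solve(B, A @ x)   # quasi-Newton step, step size 1
    s = x_next - x
    y = A @ s                           # y = grad(x_next) - grad(x) for a quadratic
    Bs = B @ s
    # Classical BFGS update of the Hessian approximation B_k.
    B += np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)
    x = x_next
    bound = (n * L / (mu * k)) ** (k / 2)
    print(f"k={k:2d}  ||x_k - x*|| = {np.linalg.norm(x):.3e}  "
          f"(nL/(mu k))^(k/2) = {bound:.3e}")
    if np.linalg.norm(x) < 1e-14:       # stop once at machine precision
        break
```

Running this, the error ratio between consecutive iterates shrinks with $k$ rather than staying constant, which is the superlinear behavior whose rate the paper quantifies.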
Pages: 159-190
Number of pages: 31