Accelerated affine-invariant convergence rates of the Frank-Wolfe algorithm with open-loop step-sizes

Cited by: 1
Authors
Wirth, Elias [1]
Peña, Javier [2]
Pokutta, Sebastian [1]
Affiliations
[1] Berlin Inst Technol, Inst Math, Str 17 Juni 135, Berlin, Germany
[2] Carnegie Mellon Univ, Tepper Sch Business, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
Funding
Andrew W. Mellon Foundation (USA);
Keywords
DOI
10.1007/s10107-024-02180-2
Chinese Library Classification
TP31 [Computer software];
Discipline Code
081202; 0835;
Abstract
Recent papers have shown that the Frank-Wolfe algorithm (FW) with open-loop step-sizes exhibits rates of convergence faster than the iconic $\mathcal{O}(t^{-1})$ rate. In particular, when the minimizer of a strongly convex function over a polytope lies on the boundary of the polytope, the FW algorithm with open-loop step-sizes $\eta_t = \frac{\ell}{t+\ell}$ for $\ell \in \mathbb{N}_{\ge 2}$ has accelerated convergence $\mathcal{O}(t^{-2})$, in contrast to the rate $\Omega(t^{-1-\epsilon})$ attainable with more complex line-search or short-step step-sizes. Given the relevance of this scenario in data science problems, research has grown to explore the settings enabling acceleration in open-loop FW. However, despite FW's well-known affine invariance, existing acceleration results for open-loop FW are affine-dependent. This paper remedies this gap in the literature by merging two recent research trajectories: affine invariance (Peña in SIAM J. Optim. 33(4):2654-2674, 2023) and open-loop step-sizes (Wirth et al. in Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023). In particular, we extend all known non-affine-invariant convergence rates for FW with open-loop step-sizes to affine-invariant results.
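To make the step-size rule above concrete, the following is a minimal Python sketch of vanilla FW with the open-loop schedule $\eta_t = \frac{\ell}{t+\ell}$, applied to a toy strongly convex quadratic over the probability simplex whose minimizer lies on the simplex boundary (the accelerated scenario described above). The problem instance, function names, and iteration budget are illustrative assumptions, not taken from the paper.

    import numpy as np

    def frank_wolfe_open_loop(grad, lmo, x0, ell=2, iterations=2000):
        # Vanilla Frank-Wolfe with open-loop step-sizes eta_t = ell / (t + ell).
        # grad: gradient oracle of the objective f
        # lmo:  linear minimization oracle over the feasible region
        # x0:   feasible starting point; ell: integer >= 2
        x = x0.copy()
        for t in range(iterations):
            g = grad(x)
            v = lmo(g)                     # vertex minimizing <g, v> over the feasible set
            eta = ell / (t + ell)          # open-loop: no line search, no smoothness constant
            x = (1.0 - eta) * x + eta * v  # convex combination keeps x feasible
        return x

    # Toy instance: minimize f(x) = 0.5 * ||x - b||^2 over the probability simplex,
    # with b itself a boundary point of the simplex (one zero coordinate).
    b = np.array([0.1, 0.7, 0.2, 0.0])
    grad = lambda x: x - b

    def lmo(g):
        v = np.zeros_like(g)
        v[np.argmin(g)] = 1.0              # best simplex vertex for a linear objective
        return v

    x = frank_wolfe_open_loop(grad, lmo, x0=np.full(4, 0.25), ell=2)
    print(x)  # approaches b; in this regime the primal gap decays roughly like t**(-2)

With $\ell = 2$ the schedule starts at $\eta_0 = 1$ and decays like $2/t$; the step-size uses no feedback from the objective, which is what makes it "open-loop" and keeps each iteration as cheap as one gradient evaluation and one LMO call.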
Pages: 45
References
47 entries
  • [21] Harper, F. M., Konstan, J. A.: The MovieLens datasets: history and context. ACM Transactions on Interactive Intelligent Systems 5(4) (2016)
  • [22] Hiriart-Urruty, J.-B.: Fundamentals of Convex Analysis (2004)
  • [23] Holloway, C. A.: Mathematical Programming 6, 14 (1974). DOI 10.1007/BF01580219
  • [24] Huber, P. J.: Robust estimation of a location parameter. Annals of Mathematical Statistics 35(1), 73-101 (1964)
  • [25] Jaggi, M.: Proceedings of the International Conference on Machine Learning, p. 427 (2013)
  • [26] Joulin, A.: Lecture Notes in Computer Science 8694, 253 (2014). DOI 10.1007/978-3-319-10599-4_17
  • [27] Kerdreux, T.: Proceedings of Machine Learning Research 139 (2021)
  • [28] Kerdreux, T.: Proceedings of Machine Learning Research 130, 19 (2021)
  • [29] Lacoste-Julien, S.: arXiv preprint (2013)
  • [30] Lacoste-Julien, S.: Advances in Neural Information Processing Systems 28 (2015)