Accelerated affine-invariant convergence rates of the Frank-Wolfe algorithm with open-loop step-sizes

被引:1
作者
Wirth, Elias [1 ]
Pena, Javier [2 ]
Pokutta, Sebastian [1 ]
机构
[1] Berlin Inst Technol, Inst Math, Str 17 Juni 135, Berlin, Germany
[2] Carnegie Mellon Univ, Tepper Sch Business, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
基金
美国安德鲁·梅隆基金会;
关键词
D O I
10.1007/s10107-024-02180-2
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Recent papers have shown that the Frank-Wolfe algorithm (FW) with open-loop step-sizes exhibits rates of convergence faster than the iconic O(t-1)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {O}(t<^>{-1})$$\end{document} rate. In particular, when the minimizer of a strongly convex function over a polytope lies on the boundary of the polytope, the FW algorithm with open-loop step-sizes eta t=& ell;t+& ell;\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta _t = \frac{\ell }{t+\ell }$$\end{document} for & ell;is an element of N >= 2\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell \in \mathbb {N}_{\ge 2}$$\end{document} has accelerated convergence O(t-2)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {O}(t<^>{-2})$$\end{document} in contrast to the rate Omega(t-1-& varepsilon;)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\Omega (t<^>{-1-\epsilon })$$\end{document} attainable with more complex line-search or short-step step-sizes. Given the relevance of this scenario in data science problems, research has grown to explore the settings enabling acceleration in open-loop FW. However, despite FW's well-known affine invariance, existing acceleration results for open-loop FW are affine-dependent. This paper remedies this gap in the literature, by merging two recent research trajectories: affine invariance (Pe & ntilde;a in SIAM J. Optim. 33(4):2654-2674, 2023) and open-loop step-sizes (Wirth et al. in Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023). In particular, we extend all known non-affine-invariant convergence rates for FW with open-loop step-sizes to affine-invariant results.
引用
收藏
页数:45
相关论文
共 47 条
  • [1] On the Effectiveness of Richardson Extrapolation in Data Science
    Bach, Francis
    [J]. SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2021, 3 (04): : 1251 - 1277
  • [2] Bach Francis, 2012, ICML 12, P1355
  • [3] Braun G., 2019, INT C MACH LEARN, P735
  • [4] Braun G, 2023, Arxiv, DOI arXiv:2211.14103
  • [5] Bugg CX, 2022, ADV NEUR IN
  • [6] Carderera A, 2021, Arxiv, DOI arXiv:2101.02630
  • [7] Coresets, Sparse Greedy Approximation, and the Frank-Wolfe Algorithm
    Clarkson, Kenneth L.
    [J]. ACM TRANSACTIONS ON ALGORITHMS, 2010, 6 (04)
  • [8] Optimal Transport for Domain Adaptation
    Courty, Nicolas
    Flamary, Remi
    Tuia, Devis
    Rakotomamonjy, Alain
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (09) : 1853 - 1865
  • [9] CONDITIONAL GRADIENT ALGORITHMS WITH OPEN LOOP STEP SIZE RULES
    DUNN, JC
    HARSHBARGER, S
    [J]. JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 1978, 62 (02) : 432 - 444
  • [10] Frank M., 1956, Naval Res. Logist. Quart., V3, P95