Accelerated affine-invariant convergence rates of the Frank-Wolfe algorithm with open-loop step-sizes

Cited by: 1
Authors
Wirth, Elias [1]
Peña, Javier [2]
Pokutta, Sebastian [1]
Affiliations
[1] Berlin Inst Technol, Inst Math, Str 17 Juni 135, Berlin, Germany
[2] Carnegie Mellon Univ, Tepper Sch Business, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
Funding
Andrew W. Mellon Foundation (USA);
Keywords
DOI
10.1007/s10107-024-02180-2
Chinese Library Classification
TP31 [Computer software];
Discipline Code
081202; 0835;
Abstract
Recent papers have shown that the Frank-Wolfe algorithm (FW) with open-loop step-sizes exhibits rates of convergence faster than the iconic $\mathcal{O}(t^{-1})$ rate. In particular, when the minimizer of a strongly convex function over a polytope lies on the boundary of the polytope, the FW algorithm with open-loop step-sizes $\eta_t = \frac{\ell}{t+\ell}$ for $\ell \in \mathbb{N}_{\ge 2}$ has accelerated convergence $\mathcal{O}(t^{-2})$, in contrast to the rate $\Omega(t^{-1-\epsilon})$ attainable with more complex line-search or short-step step-sizes. Given the relevance of this scenario in data science problems, research has grown to explore the settings enabling acceleration in open-loop FW. However, despite FW's well-known affine invariance, existing acceleration results for open-loop FW are affine-dependent. This paper remedies this gap in the literature by merging two recent research trajectories: affine invariance (Peña in SIAM J. Optim. 33(4):2654-2674, 2023) and open-loop step-sizes (Wirth et al. in Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023). In particular, we extend all known non-affine-invariant convergence rates for FW with open-loop step-sizes to affine-invariant results.
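To make the step-size rule above concrete, the following is a minimal Python sketch of vanilla FW with the open-loop schedule $\eta_t = \frac{\ell}{t+\ell}$, applied to a toy strongly convex quadratic over the probability simplex whose minimizer lies on the simplex boundary (the accelerated scenario described above). The problem instance, function names, and iteration budget are illustrative assumptions, not taken from the paper.

    import numpy as np

    def frank_wolfe_open_loop(grad, lmo, x0, ell=2, iterations=2000):
        # Vanilla Frank-Wolfe with open-loop step-sizes eta_t = ell / (t + ell).
        # grad: gradient oracle of the objective f
        # lmo:  linear minimization oracle over the feasible region
        # x0:   feasible starting point; ell: integer >= 2
        x = x0.copy()
        for t in range(iterations):
            g = grad(x)
            v = lmo(g)                     # vertex minimizing <g, v> over the feasible set
            eta = ell / (t + ell)          # open-loop: no line search, no smoothness constant
            x = (1.0 - eta) * x + eta * v  # convex combination keeps x feasible
        return x

    # Toy instance: minimize f(x) = 0.5 * ||x - b||^2 over the probability simplex,
    # with b itself a boundary point of the simplex (one zero coordinate).
    b = np.array([0.1, 0.7, 0.2, 0.0])
    grad = lambda x: x - b

    def lmo(g):
        v = np.zeros_like(g)
        v[np.argmin(g)] = 1.0              # best simplex vertex for a linear objective
        return v

    x = frank_wolfe_open_loop(grad, lmo, x0=np.full(4, 0.25), ell=2)
    print(x)  # approaches b; in this regime the primal gap decays roughly like t**(-2)

With $\ell = 2$ the schedule starts at $\eta_0 = 1$ and decays like $2/t$; the step-size uses no feedback from the objective, which is what makes it "open-loop" and keeps each iteration as cheap as one gradient evaluation and one LMO call.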
Pages: 45
References
47 entries
  • [21] Harper, F. M., Konstan, J. A.: The MovieLens datasets: history and context. ACM Transactions on Interactive Intelligent Systems 5(4) (2016)
  • [22] Hiriart-Urruty, J.-B.: Fundamentals of Convex Analysis (2004)
  • [23] Holloway, C. A.: Mathematical Programming 6, 14 (1974). DOI 10.1007/BF01580219
  • [24] Huber, P. J.: Robust estimation of a location parameter. Annals of Mathematical Statistics 35(1), 73-101 (1964)
  • [25] Jaggi, M.: Proceedings of the International Conference on Machine Learning, p. 427 (2013)
  • [26] Joulin, A.: Lecture Notes in Computer Science 8694, 253 (2014). DOI 10.1007/978-3-319-10599-4_17
  • [27] Kerdreux, T.: Proceedings of Machine Learning Research 139 (2021)
  • [28] Kerdreux, T.: Proceedings of Machine Learning Research 130, 19 (2021)
  • [29] Lacoste-Julien, S.: arXiv preprint (2013)
  • [30] Lacoste-Julien, S.: Advances in Neural Information Processing Systems 28 (2015)