On the Convergence of Proximal Gradient Methods for Convex Simple Bilevel Optimization

Times cited: 0
Authors
Latafat, Puya [1]
Themelis, Andreas [2]
Villa, Silvia [3]
Patrinos, Panagiotis [4]
Affiliations
[1] IMT Sch Adv Studies Lucca, Piazza S Francesco 19, I-55100 Lucca, Italy
[2] Kyushu Univ, Fac Informat Sci & Elect Engn ISEE, 744 Motooka,Nishi Ku, Fukuoka 8190395, Japan
[3] Univ Genoa, Dipartimento Matemat, Via Dodecaneso 35, I-16146 Genoa, Italy
[4] Katholieke Univ Leuven, Dept Elect Engn ESAT STADIUS, Kasteelpk Arenberg 10, B-3001 Leuven, Belgium
Funding
Japan Society for the Promotion of Science (JSPS); European Union Horizon 2020
Keywords
Convex optimization; Bilevel programming; Adaptive proximal gradient methods; Locally Lipschitz gradient; Viscosity approximation methods; 1st-order method; Complexity
DOI
10.1007/s10957-024-02564-6
Chinese Library Classification (CLC)
C93 [Management science]; O22 [Operations research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
This paper studies proximal gradient iterations for addressing simple bilevel optimization problems where both the upper and the lower level cost functions are split as the sum of differentiable and (possibly nonsmooth) prox-friendly functions. We develop a novel convergence recipe for iteration-varying stepsizes that relies on Barzilai-Borwein type local estimates for the differentiable terms. Leveraging the convergence recipe, under global Lipschitz gradient continuity, we establish convergence for a nonadaptive stepsize sequence, without requiring any strong convexity or linesearch. In the locally Lipschitz differentiable setting, we develop an adaptive linesearch method that introduces a systematic adaptive scheme enabling large and nonmonotonic stepsize sequences while being insensitive to the choice of hyperparameters and initialization. Numerical simulations are provided showcasing favorable convergence speed of our methods.
Pages: 36
Related papers
41 in total
[1]   Viscosity solutions of minimization problems [J].
Attouch, H .
SIAM JOURNAL ON OPTIMIZATION, 1996, 6 (03) :769-806
[2]  
Bahraoui MA., 1994, Set-Valued Anal, V2, P49, DOI 10.1007/BF01027092
[3]   2-POINT STEP SIZE GRADIENT METHODS [J].
BARZILAI, J ;
BORWEIN, JM .
IMA JOURNAL OF NUMERICAL ANALYSIS, 1988, 8 (01) :141-148
[4]  
Bauschke H., 2011, Convex analysis and monotone operator theory in Hilbert spaces, DOI 10.1007/978-3-319-48311-5
[5]  
Beck A, 2017, MOS-SIAM Series on Optimization, P1, DOI 10.1137/1.9781611974997
[6]   A first order method for finding minimal norm-like solutions of convex optimization problems [J].
Beck, Amir ;
Sabach, Shoham .
MATHEMATICAL PROGRAMMING, 2014, 147 (1-2) :25-46
[7]   Combining approximation and exact penalty in hierarchical programming [J].
Bigi, Giancarlo ;
Lampariello, Lorenzo ;
Sagratella, Simone .
OPTIMIZATION, 2022, 71 (08) :2403-2419
[8]  
Borsos Z., 2020, Advances in neural information processing systems, V33, P14879, DOI 10.48550/arXiv.2006.03875
[9]   Proximal point algorithm controlled by a slowly vanishing term: Applications to hierarchical minimization [J].
Cabot, A .
SIAM JOURNAL ON OPTIMIZATION, 2005, 15 (02) :555-572
[10]  
Cao JC, 2024, arXiv, arXiv:2402.08097