Value-Function-Based Sequential Minimization for Bi-Level Optimization

Cited by: 3
Authors
Liu, Risheng [1 ,2 ]
Liu, Xuan [1 ,2 ]
Zeng, Shangzhi [3 ]
Zhang, Jin [4 ,5 ]
Zhang, Yixuan [6 ]
Affiliations
[1] Dalian Univ Technol, DUT RU Int Sch Informat Sci & Engn, Dalian 116024, Peoples R China
[2] Key Lab Ubiquitous Network & Serv Software Liaoni, Dalian 116024, Peoples R China
[3] Univ Victoria, Dept Math & Stat, Victoria, BC V8P 5C2, Canada
[4] Southern Univ Sci & Technol, Natl Ctr Appl Math Shenzhen, Dept Math, SUSTech Int Ctr Math, Shenzhen 518055, Peoples R China
[5] Peng Cheng Lab, Shenzhen 518055, Peoples R China
[6] Hong Kong Polytech Univ, Dept Appl Math, Hong Kong, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Bi-level optimization; gradient-based method; hyper-parameter optimization; sequential minimization; value-function; MODEL;
DOI
10.1109/TPAMI.2023.3303227
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Gradient-based Bi-Level Optimization (BLO) methods have been widely applied to modern learning tasks. However, most existing strategies are theoretically designed under restrictive assumptions (e.g., convexity of the lower-level sub-problem) and are computationally impractical for high-dimensional tasks. Moreover, almost no gradient-based methods can solve BLO in challenging scenarios such as BLO with functional constraints and pessimistic BLO. In this work, by reformulating BLO into a sequence of approximated single-level problems, we provide a new algorithm, named Bi-level Value-Function-based Sequential Minimization (BVFSM), to address the above issues. Specifically, BVFSM constructs a series of value-function-based approximations, thereby avoiding the repeated computation of recurrent gradients and Hessian inverses required by existing approaches, which is especially time-consuming for high-dimensional tasks. We also extend BVFSM to BLO with additional functional constraints. More importantly, BVFSM can be applied to the challenging pessimistic BLO, which has never been properly solved before. In theory, we prove the asymptotic convergence of BVFSM on these types of BLO, discarding the restrictive lower-level convexity assumption. To the best of our knowledge, this is the first gradient-based algorithm that can solve different kinds of BLO (e.g., optimistic, pessimistic, and constrained) with solid convergence guarantees. Extensive experiments verify the theoretical results and demonstrate the superiority of our method on various real-world applications.
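To make the value-function reformulation in the abstract concrete, the following is a minimal, self-contained sketch on a toy problem. The objectives `f` and `g`, the penalty weight `mu`, the step sizes, and the penalty scheme are all illustrative assumptions for this toy example; this is not the paper's actual BVFSM algorithm.

```python
# Illustrative sketch (NOT the paper's BVFSM implementation).
# Toy optimistic bi-level problem:
#   upper level: min_{x,y} f(x, y) = (x - 1)^2 + (y - 1)^2
#   lower level: y in argmin_y g(x, y) = (y - x)^2
# Value-function reformulation: replace the lower-level problem with the
# single-level constraint g(x, y) <= v(x), where v(x) = min_y g(x, y),
# and fold that constraint into the objective with penalty weight 1/mu.

def f(x, y):
    return (x - 1.0) ** 2 + (y - 1.0) ** 2

def g(x, y):
    return (y - x) ** 2

def value_function(x, steps=100, lr=0.1):
    """Approximate v(x) = min_y g(x, y) by inner gradient descent on y."""
    y = 0.0
    for _ in range(steps):
        y -= lr * 2.0 * (y - x)  # dg/dy
    return g(x, y)

def penalized_descent(mu=0.1, steps=3000, lr=0.02):
    """Gradient descent on f(x, y) + (1/mu) * max(g(x, y) - v(x), 0).
    For this toy problem dv/dx = 0 at the lower-level minimizer, so the
    penalty (sub)gradient only involves g."""
    x, y = 0.0, 2.0
    for _ in range(steps):
        active = g(x, y) - value_function(x) > 0.0  # penalty active?
        gx = 2.0 * (x - 1.0) - (2.0 * (y - x) / mu if active else 0.0)
        gy = 2.0 * (y - 1.0) + (2.0 * (y - x) / mu if active else 0.0)
        x -= lr * gx
        y -= lr * gy
    return x, y

x, y = penalized_descent()
print(round(x, 2), round(y, 2))  # both close to the bi-level solution (1, 1)
```

The key point the sketch illustrates is that the lower-level problem never needs to be differentiated through (no recurrent gradients or Hessian inverses): each outer step only requires an approximate value of `v(x)`, obtained here by a short inner minimization.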
Pages: 15930-15948
Page count: 19