Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization

Cited by: 0
Authors
Zhang, Guodong [1]
Wang, Yuanhao [2]
Lessard, Laurent [3]
Grosse, Roger [1]
Affiliations
[1] Univ Toronto, Toronto, ON, Canada
[2] Princeton Univ, Princeton, NJ 08544 USA
[3] Northeastern Univ, Boston, MA 02115 USA
Keywords
Variational inequality; Algorithms
DOI
Not available
CLC classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Smooth minimax games often proceed by simultaneous or alternating gradient updates. Although algorithms with alternating updates are commonly used in practice, most existing theoretical analyses focus on simultaneous algorithms for convenience of analysis. In this paper, we study alternating gradient descent-ascent (Alt-GDA) in minimax games and show that Alt-GDA is superior to its simultaneous counterpart (Sim-GDA) in many settings. We prove that Alt-GDA achieves a near-optimal local convergence rate for strongly convex-strongly concave (SCSC) problems, while Sim-GDA converges at a much slower rate. To our knowledge, this is the first result in any setting showing that Alt-GDA converges faster than Sim-GDA by more than a constant factor. We further adapt the theory of integral quadratic constraints (IQC) and show that Alt-GDA attains the same rate globally for a subclass of SCSC minimax problems. Empirically, we demonstrate that alternating updates significantly speed up GAN training, and that optimism helps only for simultaneous algorithms.
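To illustrate the two update schemes compared in the abstract, below is a minimal sketch (not the authors' code) of simultaneous versus alternating gradient descent-ascent on a toy strongly convex-strongly concave quadratic game. The objective f(x, y) = (mu/2)x^2 + b*x*y - (mu/2)y^2 and the constants mu, b, eta, and steps are arbitrary choices for illustration, not values from the paper.

```python
# Illustrative sketch: simultaneous vs. alternating gradient descent-ascent
# on the toy SCSC game f(x, y) = (mu/2)*x**2 + b*x*y - (mu/2)*y**2,
# whose unique saddle point is (0, 0). Constants are illustrative only.
import numpy as np

mu, b, eta, steps = 1.0, 4.0, 0.05, 200

def grad_x(x, y):
    # Partial derivative of f with respect to x (descent player).
    return mu * x + b * y

def grad_y(x, y):
    # Partial derivative of f with respect to y (ascent player).
    return b * x - mu * y

def sim_gda(x, y):
    """Simultaneous GDA: both players update from the same iterate."""
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - eta * gx, y + eta * gy
    return x, y

def alt_gda(x, y):
    """Alternating GDA: the ascent step on y sees the freshly updated x."""
    for _ in range(steps):
        x = x - eta * grad_x(x, y)
        y = y + eta * grad_y(x, y)
    return x, y

if __name__ == "__main__":
    x0, y0 = 1.0, 1.0
    print("Sim-GDA distance to saddle:", np.linalg.norm(sim_gda(x0, y0)))
    print("Alt-GDA distance to saddle:", np.linalg.norm(alt_gda(x0, y0)))
```

The only difference between the two loops is that the alternating scheme uses the updated x when taking the ascent step in y, which is the mechanism whose local convergence rate the paper analyzes.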
Pages: 21