Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization

Cited by: 0
Authors
Zhang, Guodong [1]
Wang, Yuanhao [2]
Lessard, Laurent [3]
Grosse, Roger [1]
Affiliations
[1] Univ Toronto, Toronto, ON, Canada
[2] Princeton Univ, Princeton, NJ 08544 USA
[3] Northeastern Univ, Boston, MA 02115 USA
Keywords
Variational inequality; Algorithms
DOI
Not available
CLC classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Smooth minimax games often proceed by simultaneous or alternating gradient updates. Although algorithms with alternating updates are commonly used in practice, most existing theoretical analyses focus on simultaneous algorithms for convenience of analysis. In this paper, we study alternating gradient descent-ascent (Alt-GDA) in minimax games and show that Alt-GDA is superior to its simultaneous counterpart (Sim-GDA) in many settings. We prove that Alt-GDA achieves a near-optimal local convergence rate for strongly convex-strongly concave (SCSC) problems, while Sim-GDA converges at a much slower rate. To our knowledge, this is the first result in any setting showing that Alt-GDA converges faster than Sim-GDA by more than a constant factor. We further adapt the theory of integral quadratic constraints (IQC) and show that Alt-GDA attains the same rate globally for a subclass of SCSC minimax problems. Empirically, we demonstrate that alternating updates significantly speed up GAN training, and that optimism helps only for simultaneous algorithms.
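To illustrate the two update schemes compared in the abstract, below is a minimal sketch (not the authors' code) of simultaneous versus alternating gradient descent-ascent on a toy strongly convex-strongly concave quadratic game. The objective f(x, y) = (mu/2)x^2 + b*x*y - (mu/2)y^2 and the constants mu, b, eta, and steps are arbitrary choices for illustration, not values from the paper.

```python
# Illustrative sketch: simultaneous vs. alternating gradient descent-ascent
# on the toy SCSC game f(x, y) = (mu/2)*x**2 + b*x*y - (mu/2)*y**2,
# whose unique saddle point is (0, 0). Constants are illustrative only.
import numpy as np

mu, b, eta, steps = 1.0, 4.0, 0.05, 200

def grad_x(x, y):
    # Partial derivative of f with respect to x (descent player).
    return mu * x + b * y

def grad_y(x, y):
    # Partial derivative of f with respect to y (ascent player).
    return b * x - mu * y

def sim_gda(x, y):
    """Simultaneous GDA: both players update from the same iterate."""
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - eta * gx, y + eta * gy
    return x, y

def alt_gda(x, y):
    """Alternating GDA: the ascent step on y sees the freshly updated x."""
    for _ in range(steps):
        x = x - eta * grad_x(x, y)
        y = y + eta * grad_y(x, y)
    return x, y

if __name__ == "__main__":
    x0, y0 = 1.0, 1.0
    print("Sim-GDA distance to saddle:", np.linalg.norm(sim_gda(x0, y0)))
    print("Alt-GDA distance to saddle:", np.linalg.norm(alt_gda(x0, y0)))
```

The only difference between the two loops is that the alternating scheme uses the updated x when taking the ascent step in y, which is the mechanism whose local convergence rate the paper analyzes.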
Pages: 21