On Convergence of Gradient Descent Ascent: A Tight Local Analysis

Cited by: 0
Authors
Li, Haochuan [1 ]
Farnia, Farzan [2 ]
Das, Subhro [3 ]
Jadbabaie, Ali [4 ]
Affiliations
[1] MIT, Dept EECS, Cambridge, MA 02139 USA
[2] Chinese Univ Hong Kong, Dept CSE, Hong Kong, Peoples R China
[3] MIT, IBM Watson AI Lab, IBM Res, Cambridge, MA 02139 USA
[4] MIT, Dept CEE, Cambridge, MA 02139 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022
Keywords
EQUILIBRIA
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Gradient Descent Ascent (GDA) methods are the mainstream algorithms for minimax optimization in generative adversarial networks (GANs). Convergence properties of GDA have drawn significant interest in the recent literature. Specifically, for $\min_x \max_y f(x, y)$ where $f$ is strongly concave in $y$ and possibly nonconvex in $x$, (Lin et al., 2020) proved the convergence of GDA with a stepsize ratio $\eta_y/\eta_x = \Theta(\kappa^2)$, where $\eta_x$ and $\eta_y$ are the stepsizes for $x$ and $y$ and $\kappa$ is the condition number for $y$. While this stepsize ratio suggests slow training of the min player, practical GAN algorithms typically adopt similar stepsizes for both variables, indicating a wide gap between theoretical and empirical results. In this paper, we aim to bridge this gap by analyzing the local convergence of general nonconvex-nonconcave minimax problems. We demonstrate that a stepsize ratio of $\Theta(\kappa)$ is necessary and sufficient for local convergence of GDA to a Stackelberg Equilibrium, where $\kappa$ is the local condition number for $y$. We prove a nearly tight convergence rate with a matching lower bound. We further extend the convergence guarantees to stochastic GDA and extra-gradient (EG) methods. Finally, we conduct several numerical experiments to support our theoretical findings.
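The $\Theta(\kappa)$ stepsize-ratio claim can be illustrated with a local linear-stability check: near a stationary point, one simultaneous GDA step acts as a linear map, and GDA converges locally exactly when the spectral radius of that map is below 1. The sketch below is a minimal toy example, not taken from the paper; the quadratic instance, the constants a, b, mu, ell, and the stepsize choice eta_y = 1/(2*ell) are all assumptions made for illustration.

```python
import numpy as np

# Toy quadratic minimax problem (illustrative assumption, not from the paper):
#   f(x, y) = -0.5*a*x**2 + x*(b @ y) - 0.5 * y @ C @ y
# with scalar x and 2-dimensional y.  f is strongly concave in y (C > 0) and
# nonconvex in x (a > 0).  The origin is a local Stackelberg equilibrium since
# the Schur complement -a + b @ inv(C) @ b = 9.1 is positive.
a = 1.0
b = np.array([1.0, 1.0])
mu, ell = 0.1, 10.0                  # strong-concavity / smoothness constants in y
C = np.diag([mu, ell])
kappa = ell / mu                     # local condition number for y

def gda_jacobian(eta_x, eta_y):
    """Jacobian of one simultaneous GDA step, linearized at the origin."""
    # x_{t+1} = (1 + eta_x*a)*x_t - eta_x * (b @ y_t)
    # y_{t+1} = eta_y * b * x_t + (I - eta_y*C) @ y_t
    return np.block([
        [np.array([[1.0 + eta_x * a]]), -eta_x * b[None, :]],
        [eta_y * b[:, None],            np.eye(2) - eta_y * C],
    ])

eta_y = 1.0 / (2.0 * ell)            # a conservative stepsize for the max player
for ratio in [1.0, kappa, kappa ** 2]:
    J = gda_jacobian(eta_y / ratio, eta_y)
    rho = np.max(np.abs(np.linalg.eigvals(J)))
    verdict = "locally convergent" if rho < 1 else "diverges locally"
    print(f"eta_y/eta_x = {ratio:8.1f}   spectral radius = {rho:.4f}   ({verdict})")
```

For these particular constants, the equal-stepsize configuration fails the stability check, while both larger ratios pass; with $\eta_y$ fixed, the $\Theta(\kappa)$ ratio gives a spectral radius further below 1 than the $\Theta(\kappa^2)$ ratio, i.e., faster local convergence.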
Pages: 24
Related Papers
50 records in total
  • [1] Local Stochastic Gradient Descent Ascent: Convergence Analysis and Communication Efficiency
    Deng, Yuyang
    Mahdavi, Mehrdad
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [2] Local Convergence of Gradient Descent-Ascent for Training Generative Adversarial Networks
    Becker, Evan
    Pandit, Parthe
    Rangan, Sundeep
    Fletcher, Alyson K.
    FIFTY-SEVENTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2023: 892 - 896
  • [3] A Tight Convergence Analysis for Stochastic Gradient Descent with Delayed Updates
    Arjevani, Yossi
    Shamir, Ohad
    Srebro, Nathan
    ALGORITHMIC LEARNING THEORY, VOL 117, 2020, 117 : 111 - 132
  • [4] On the Convergence of Stochastic Compositional Gradient Descent Ascent Method
    Gao, Hongchang
    Wang, Xiaoqian
    Luo, Lei
    Shi, Xinghua
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 2389 - 2395
  • [5] Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization
    Zhang, Guodong
    Wang, Yuanhao
    Lessard, Laurent
    Grosse, Roger
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [6] Tight Convergence Rate of Gradient Descent for Eigenvalue Computation
    Ding, Qinghua
    Zhou, Kaiwen
    Cheng, James
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3276 - 3282
  • [7] A Comprehensively Tight Analysis of Gradient Descent for PCA
    Xu, Zhiqiang
    Li, Ping
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [8] On the Convergence of Local Stochastic Compositional Gradient Descent with Momentum
    Gao, Hongchang
    Li, Junyi
    Huang, Heng
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [9] Randomized Stochastic Gradient Descent Ascent
    Sebbouh, Othmane
    Cuturi, Marco
    Peyre, Gabriel
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [10] Convergence analysis of gradient descent stochastic algorithms
    Shapiro, A
    Wardi, Y
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 1996, 91 (02) : 439 - 454