Gradient Descent Ascent for Minimax Problems on Riemannian Manifolds

被引:5
|
作者
Huang, Feihu [1 ,2 ]
Gao, Shangqian [3 ]
机构
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 210016, Jiangsu, Peoples R China
[2] MIIT Key Lab Pattern Anal & Machine Intelligence, Nanjing 211106, Jiangsu, Peoples R China
[3] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15260 USA
关键词
Manifolds; Optimization; Training; Machine learning; Complexity theory; Principal component analysis; Neural networks; Deep neural networks; minimax optimization; riemanian manifolds; robust optimization; stiefel manifold; MONOTONE VECTOR-FIELDS; ALGORITHM;
D O I
10.1109/TPAMI.2023.3234160
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In the paper, we study a class of useful minimax problems on Riemanian manifolds and propose a class of effective Riemanian gradient-based methods to solve these minimax problems. Specifically, we propose an effective Riemannian gradient descent ascent (RGDA) algorithm for the deterministic minimax optimization. Moreover, we prove that our RGDA has a sample complexity of O(kappa(2)epsilon(-2)) for finding an epsilon-stationary solution of the Geodesically-Nonconvex Strongly-Concave (GNSC) minimax problems, where kappa denotes the condition number. At the same time, we present an effective Riemannian stochastic gradient descent ascent (RSGDA) algorithm for the stochastic minimax optimization, which has a sample complexity of O(kappa(4)epsilon(-4)) for finding an epsilon-stationary solution. To further reduce the sample complexity, we propose an accelerated Riemannian stochastic gradient descent ascent (Acc-RSGDA) algorithm based on the momentum-based variance-reduced technique. We prove that our Acc-RSGDA algorithm achieves a lower sample complexity of (O) over tilde(kappa(4)epsilon(-3)) in searching for an epsilon-stationary solution of the GNSC minimax problems. Extensive experimental results on the robust distributional optimization and robust Deep Neural Networks (DNNs) training over Stiefel manifold demonstrate efficiency of our algorithms.
引用
收藏
页码:8466 / 8476
页数:11
相关论文
共 50 条