Gradient Play in Stochastic Games: Stationary Points, Convergence, and Sample Complexity

Cited: 1
Authors
Zhang, Runyu [1 ]
Ren, Zhaolin [1 ]
Li, Na [1 ]
Affiliations
[1] Harvard Univ, Harvard Sch Engn & Appl Sci, Cambridge, MA 02138 USA
Keywords
Games; Convergence; Stochastic processes; Nash equilibrium; Geometry; Complexity theory; Stability analysis; Multiagent systems; Reinforcement learning
DOI
10.1109/TAC.2024.3387208
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
In this article, we study the performance of the gradient play algorithm for stochastic games (SGs), where each agent tries to maximize its own total discounted reward by making decisions independently based on the current state information, which is shared among agents. Policies are directly parameterized by the probabilities of choosing each action at each state. We show that Nash equilibria (NEs) and first-order stationary policies are equivalent in this setting, and we give a local convergence rate around strict NEs. Furthermore, for a subclass of SGs called Markov potential games (which includes the identical-reward setting as an important special case), we design a sample-based reinforcement learning algorithm and give a nonasymptotic global convergence rate analysis for both exact gradient play and our sample-based learning algorithm. Our results show that the number of iterations needed to reach an epsilon-NE scales linearly, rather than exponentially, with the number of agents. We also study the local geometry and local stability, proving that strict NEs are local maxima of the total potential function and that fully mixed NEs are saddle points.
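
As context for the abstract's description of exact gradient play with direct parameterization, the following minimal Python sketch (not from the article; the random game, the step size eta, the horizon T, and the simplex-projection routine are all illustrative assumptions) runs simultaneous projected gradient ascent for two agents on a toy identical-reward SG, the Markov-potential-game special case highlighted above:

    import numpy as np

    # Toy identical-reward stochastic game (a Markov potential game). All
    # specifics below (sizes, discount, step size, horizon) are illustrative
    # assumptions, not taken from the article.
    rng = np.random.default_rng(0)
    nS, nA, gamma, eta, T = 3, 2, 0.9, 0.01, 2000
    P = rng.random((nS, nA, nA, nS))            # P[s, a1, a2, s']
    P /= P.sum(axis=-1, keepdims=True)          # normalize transition kernel
    r = rng.random((nS, nA, nA))                # shared reward r[s, a1, a2]
    rho = np.full(nS, 1.0 / nS)                 # initial state distribution

    def project_simplex(v):
        # Euclidean projection of v onto the probability simplex.
        u = np.sort(v)[::-1]
        css = np.cumsum(u)
        k = np.nonzero(u + (1.0 - css) / np.arange(1, len(v) + 1) > 0)[0][-1]
        return np.maximum(v + (1.0 - css[k]) / (k + 1.0), 0.0)

    def evaluate(pi1, pi2):
        # Exact policy evaluation: value V, joint Q, discounted occupancy d.
        joint = pi1[:, :, None] * pi2[:, None, :]             # pi(a1, a2 | s)
        P_pi = np.einsum('sab,sabt->st', joint, P)            # induced chain
        r_pi = np.einsum('sab,sab->s', joint, r)              # expected reward
        V = np.linalg.solve(np.eye(nS) - gamma * P_pi, r_pi)
        Q = r + gamma * np.einsum('sabt,t->sab', P, V)
        d = (1 - gamma) * np.linalg.solve((np.eye(nS) - gamma * P_pi).T, rho)
        return V, Q, d

    pi1 = np.full((nS, nA), 1.0 / nA)           # direct parameterization:
    pi2 = np.full((nS, nA), 1.0 / nA)           # one simplex per state

    for t in range(T):
        V, Q, d = evaluate(pi1, pi2)
        # Partial gradients (policy gradient theorem, direct parameterization):
        # occupancy d(s) times Q marginalized over the other agent's policy.
        g1 = d[:, None] * np.einsum('sab,sb->sa', Q, pi2) / (1 - gamma)
        g2 = d[:, None] * np.einsum('sab,sa->sb', Q, pi1) / (1 - gamma)
        # Simultaneous projected gradient ascent step for each agent.
        pi1 = np.array([project_simplex(pi1[s] + eta * g1[s]) for s in range(nS)])
        pi2 = np.array([project_simplex(pi2[s] + eta * g2[s]) for s in range(nS)])

    V, _, _ = evaluate(pi1, pi2)
    print("value of the learned joint policy at rho:", float(rho @ V))

In the identical-reward case the shared value rho @ V serves as the potential, so each simultaneous step is, to first order, a joint ascent step on the potential; per the abstract's equivalence result, the fixed points of such an iteration are exactly the NEs of the game.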
Pages: 6499-6514
Page count: 16
Related Papers
50 records in total
  • [1] Gradient Play in Stochastic Games: Stationary Points and Local Geometry
    Zhang, Runyu
    Ren, Zhaolin
    Li, Na
    IFAC PAPERSONLINE, 2022, 55(30): 73-78
  • [2] Convergence and Sample Complexity of Policy Gradient Methods for Stabilizing Linear Systems
    Zhao, Feiran
    Fu, Xingyun
    You, Keyou
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2025, 70(3): 1455-1466
  • [3] Linear Convergence of Independent Natural Policy Gradient in Games With Entropy Regularization
    Sun, Youbang
    Liu, Tao
    Kumar, P. R.
    Shahrampour, Shahin
    IEEE CONTROL SYSTEMS LETTERS, 2024, 8: 1217-1222
  • [4] Geometric Convergence of Gradient Play Algorithms for Distributed Nash Equilibrium Seeking
    Tatarenko, Tatiana
    Shi, Wei
    Nedic, Angelia
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66(11): 5342-5353
  • [5] Deep Fictitious Play for Stochastic Differential Games
    Hu, Ruimeng
    COMMUNICATIONS IN MATHEMATICAL SCIENCES, 2021, 19(2): 325-353
  • [6] Smooth sample average approximation of stationary points in nonsmooth stochastic optimization and applications
    Xu, Huifu
    Zhang, Dali
    MATHEMATICAL PROGRAMMING, 2009, 119(2): 371-401
  • [7] Recursive Reasoning With Reduced Complexity and Intermittency for Nonequilibrium Learning in Stochastic Games
    Fotiadis, Filippos
    Vamvoudakis, Kyriakos G.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34(11): 8467-8481
  • [8] Convergence of Policy Gradient Methods for Nash Equilibria in General-sum Stochastic Games
    Chen, Yan
    Li, Tao
    IFAC PAPERSONLINE, 2023, 56(2): 3435-3440
  • [9] On the Convergence of Fictitious Play in Channel Selection Games
    Perlaza, S. M.
    Quintero-Florez, V.
    Tembine, H.
    Lasaulce, S.
    IEEE LATIN AMERICA TRANSACTIONS, 2011, 9(4): 470-476
  • [10] Novel Convergence Results of Adaptive Stochastic Gradient Descents
    Sun, Tao
    Qiao, Linbo
    Liao, Qing
    Li, Dongsheng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30: 1044-1056