Sample Complexity of Decentralized Tabular Q-Learning for Stochastic Games

Cited by: 2
Authors
Gao, Zuguang [1]
Ma, Qianqian [2]
Basar, Tamer [3]
Birge, John R. [1]
Affiliations
[1] Univ Chicago, Booth Sch Business, Chicago, IL 60637 USA
[2] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
[3] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
Funding
U.S. National Science Foundation
Keywords
DOI
10.23919/ACC55779.2023.10155822
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
In this paper, we carry out a finite-sample analysis of decentralized Q-learning algorithms in the tabular setting for a significant subclass of general-sum stochastic games (SGs): weakly acyclic SGs, which include potential games and Markov team problems as special cases. In the practical yet challenging decentralized setting, each agent observes neither the rewards nor the actions of the other agents; in fact, each agent can be completely oblivious to the presence of other decision makers. In this work, we establish the sample complexity with which the decentralized tabular Q-learning algorithm of [1] converges to a Markov perfect equilibrium.
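To make the information structure of the decentralized setting concrete, below is a minimal illustrative sketch of independent tabular Q-learning: each agent keeps a Q-table indexed only by the global state and its own action, and updates it from the state and its own reward, never observing the other agents' actions or rewards. The environment interface (`reset`, `step`) and all names here are hypothetical placeholders, and this sketch is not the specific algorithm analyzed in the paper, which imposes additional structure on top of this oblivious information pattern to guarantee convergence to a Markov perfect equilibrium in weakly acyclic SGs.

```python
import numpy as np

# Sketch of decentralized (independent) tabular Q-learning.
# Each agent treats the other agents as part of the environment:
# it sees only the global state and its own reward.

class IndependentQAgent:
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95, eps=0.1, rng=None):
        self.Q = np.zeros((n_states, n_actions))   # Q-table over (state, own action)
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.n_actions = n_actions
        self.rng = rng or np.random.default_rng()

    def act(self, s):
        # epsilon-greedy exploration over the agent's OWN action space only
        if self.rng.random() < self.eps:
            return int(self.rng.integers(self.n_actions))
        return int(np.argmax(self.Q[s]))

    def update(self, s, a, r, s_next):
        # standard tabular Q-learning update, using only local information
        td_target = r + self.gamma * np.max(self.Q[s_next])
        self.Q[s, a] += self.alpha * (td_target - self.Q[s, a])


def run_decentralized_q(env, n_agents, n_states, n_actions, episodes=1000, horizon=100):
    # `env` is a hypothetical multi-agent environment:
    #   env.reset() -> state,  env.step(joint_action) -> (next_state, rewards)
    # where rewards[i] is revealed only to agent i.
    agents = [IndependentQAgent(n_states, n_actions) for _ in range(n_agents)]
    for _ in range(episodes):
        s = env.reset()
        for _ in range(horizon):
            joint_action = [ag.act(s) for ag in agents]
            s_next, rewards = env.step(joint_action)
            for ag, a, r in zip(agents, joint_action, rewards):
                ag.update(s, a, r, s_next)   # no access to others' actions or rewards
            s = s_next
    return agents
```

The key design point illustrated here is that each agent's update rule is exactly single-agent Q-learning; all coupling between agents enters only through the state transitions and rewards generated by the joint action.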
Pages: 1098-1103
Page count: 6