Sample Complexity of Decentralized Tabular Q-Learning for Stochastic Games

Cited by: 2
Authors
Gao, Zuguang [1]
Ma, Qianqian [2]
Basar, Tamer [3]
Birge, John R. [1]
Affiliations
[1] Univ Chicago, Booth Sch Business, Chicago, IL 60637 USA
[2] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
[3] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
Funding
U.S. National Science Foundation
DOI
10.23919/ACC55779.2023.10155822
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
In this paper, we carry out a finite-sample analysis of decentralized Q-learning algorithms in the tabular setting for a significant subclass of general-sum stochastic games (SGs): weakly acyclic SGs, which include potential games and Markov team problems as special cases. In the practical yet challenging decentralized setting, each agent can observe neither the rewards nor the actions of the other agents. In fact, each agent can be completely oblivious to the presence of other decision makers. We derive the sample complexity for the decentralized tabular Q-learning algorithm in [1] to converge to a Markov perfect equilibrium.
Pages: 1098-1103
Number of pages: 6
Related papers
50 in total
  • [1] Decentralized Q-Learning for Stochastic Teams and Games
    Arslan, Gurdal
    Yuksel, Serdar
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (04) : 1545 - 1558
  • [2] Asynchronous Decentralized Q-Learning in Stochastic Games
    Yongacoglu, Bora
    Arslan, Gurdal
    Yuksel, Serdar
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 5008 - 5013
  • [3] Decentralized Q-Learning with Constant Aspirations in Stochastic Games
    Yongacoglu, Bora
    Arslan, Gurdal
    Yuksel, Serdar
    CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019, : 1744 - 1749
  • [4] Decentralized Q-Learning for Weakly Acyclic Stochastic Dynamic Games
    Arslan, Gurdal
    Yuksel, Serdar
    2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015, : 6743 - 6748
  • [5] Decentralized Q-Learning in Zero-sum Markov Games
    Sayin, Muhammed O.
    Zhang, Kaiqing
    Leslie, David S.
    Basar, Tamer
    Ozdaglar, Asuman
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Sample Complexity of Kernel-Based Q-Learning
    Yeh, Sing-Yuan
    Chang, Fu-Chieh
    Yueh, Chang-Wei
    Wu, Pei-Yuan
    Bernacchia, Alberto
    Vakili, Sattar
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 206, 2023, 206 : 453 - 469
  • [7] Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning
    Li, Gen
    Cai, Changxiao
    Chen, Yuxin
    Gu, Yuantao
    Wei, Yuting
    Chi, Yuejie
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [8] The Sample Complexity of Teaching-by-Reinforcement on Q-Learning
    Zhang, Xuezhou
    Bharti, Shubham Kumar
    Ma, Yuzhe
    Singla, Adish
    Zhu, Xiaojin
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10939 - 10947
  • [9] Selectively Decentralized Q-Learning
    Thanh Nguyen
    Mukhopadhyay, Snehasis
    2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017, : 328 - 333
  • [10] Nash Q-learning for general-sum stochastic games
    Hu, JL
    Wellman, MP
    JOURNAL OF MACHINE LEARNING RESEARCH, 2004, 4 (06) : 1039 - 1069