Sample Complexity of Decentralized Tabular Q-Learning for Stochastic Games

Cited by: 2
Authors:
Gao, Zuguang [1 ]
Ma, Qianqian [2 ]
Basar, Tamer [3 ]
Birge, John R. [1 ]
Affiliations:
[1] Univ Chicago, Booth Sch Business, Chicago, IL 60637 USA
[2] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
[3] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
Funding:
National Science Foundation (NSF), USA
Keywords:
DOI: 10.23919/ACC55779.2023.10155822
Chinese Library Classification (CLC): TP [Automation technology, computer technology]
Discipline classification code: 0812
Abstract:
In this paper, we carry out a finite-sample analysis of decentralized Q-learning algorithms in the tabular setting for a significant subclass of general-sum stochastic games (SGs): weakly acyclic SGs, which include potential games and Markov team problems as special cases. In the practical yet challenging decentralized setting, each agent observes neither the rewards nor the actions of the other agents; in fact, each agent can be completely oblivious to the presence of other decision makers. In this work, we establish the sample complexity with which the decentralized tabular Q-learning algorithm of [1] converges to a Markov perfect equilibrium.
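The decentralized setting described in the abstract, where each agent runs tabular Q-learning using only the state, its own action, and its own reward, can be illustrated with a minimal sketch. This is not the authors' exact algorithm from [1] (which involves additional machinery such as exploration phases to guarantee convergence in weakly acyclic SGs); it is a generic independent-learner setup on a hypothetical two-agent team game with a common reward, invented here for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-agent, 2-state, 2-action team game (illustrative, not from [1]):
# both agents receive the same reward, but each runs an independent
# tabular Q-learner that never observes the other agent's action.
n_states, n_actions, n_agents = 2, 2, 2

def env_step(state, actions):
    """Hypothetical dynamics: reward 1 if the agents' actions match."""
    reward = 1.0 if actions[0] == actions[1] else 0.0
    next_state = (state + 1) % n_states  # simple deterministic transition
    return next_state, reward

# One Q-table per agent, indexed only by (state, own action).
Q = [np.zeros((n_states, n_actions)) for _ in range(n_agents)]
alpha, gamma, eps = 0.1, 0.9, 0.2

state = 0
for t in range(5000):
    actions = []
    for i in range(n_agents):
        if rng.random() < eps:                      # epsilon-greedy exploration
            actions.append(int(rng.integers(n_actions)))
        else:
            actions.append(int(np.argmax(Q[i][state])))
    next_state, reward = env_step(state, actions)
    for i in range(n_agents):
        # Standard tabular Q-update using only agent i's own action;
        # the joint action is never seen, so each agent treats the
        # environment (including the other agent) as stationary.
        td_target = reward + gamma * np.max(Q[i][next_state])
        Q[i][state, actions[i]] += alpha * (td_target - Q[i][state, actions[i]])
    state = next_state

# Greedy policies at state 0; on this matching game the independent
# learners typically coordinate on a common action.
policies = [int(np.argmax(Q[i][0])) for i in range(n_agents)]
print(policies)
```

Because both agents here share the reward signal, plain independent Q-learning tends to coordinate; the contribution of the paper is quantifying, with finite-sample bounds, how many samples a (suitably modified) decentralized scheme needs to reach a Markov perfect equilibrium in the broader weakly acyclic class.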
Pages: 1098-1103 (6 pages)
Related papers (50 in total; items [41]-[50] shown):
  • [41] Switching Q-learning using golden cross on hunter games
    Ikimi, Taisuke
    Matsumoto, Keinosuke
    Mori, Naoki
    IEEJ Transactions on Electronics, Information and Systems, 2014, 134 (09) : 1318 - 1324
  • [42] Decentralized Cognitive MAC Protocol Design Based on POMDP and Q-Learning
    Lan, Zhongli
    Jiang, Hong
    Wu, Xiaoli
    2012 7TH INTERNATIONAL ICST CONFERENCE ON COMMUNICATIONS AND NETWORKING IN CHINA (CHINACOM), 2012, : 548 - 551
  • [43] A Consensus Q-Learning Approach for Decentralized Control of Shared Energy Storage
    Joshi, Amit
    Tipaldi, Massimo
    Glielmo, Luigi
    IEEE CONTROL SYSTEMS LETTERS, 2023, 7 : 3447 - 3452
  • [44] Large-scale tabular-form hardware architecture for Q-learning with delays
    Liu, Zhenzehn
    Elhanany, Itamar
    2007 50TH MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-3, 2007, : 687 - 690
  • [45] Optimal Deception Asset Deployment in Cybersecurity: A Nash Q-Learning Approach in Multi-Agent Stochastic Games
    Kong, Guanhua
    Chen, Fucai
    Yang, Xiaohan
    Cheng, Guozhen
    Zhang, Shuai
    He, Weizhen
    APPLIED SCIENCES-BASEL, 2024, 14 (01):
  • [46] Delay-Aware Decentralized Q-learning for Wind Farm Control
    Monroc, Claire Bizon
    Bouba, Eva
    Busic, Ana
    Dubuc, Donatien
    Zhu, Jiamin
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 807 - 813
  • [47] Q-LEARNING
    WATKINS, CJCH
    DAYAN, P
    MACHINE LEARNING, 1992, 8 (3-4) : 279 - 292
  • [48] Deep Reinforcement Learning: From Q-Learning to Deep Q-Learning
    Tan, Fuxiao
    Yan, Pengfei
    Guan, Xinping
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT IV, 2017, 10637 : 475 - 483
  • [49] Backward Q-learning: The combination of Sarsa algorithm and Q-learning
    Wang, Yin-Hao
    Li, Tzuu-Hseng S.
    Lin, Chih-Jui
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (09) : 2184 - 2193
  • [50] Functional Systems Network Outperforms Q-learning in Stochastic Environment
    Sorokin, Artyom Y.
    Burtsev, Mikhail S.
    7TH ANNUAL INTERNATIONAL CONFERENCE ON BIOLOGICALLY INSPIRED COGNITIVE ARCHITECTURES, (BICA 2016), 2016, 88 : 397 - 402