Decentralized Q-Learning for Weakly Acyclic Stochastic Dynamic Games

Cited: 0
Authors
Arslan, Gurdal [1 ]
Yuksel, Serdar [2 ]
Affiliations
[1] Univ Hawaii Manoa, Dept Elect Engn, 440 Holmes Hall,2540 Dole St, Honolulu, HI 96822 USA
[2] Queens Univ, Dept Math & Stat, Kingston, ON K7L 3N6, Canada
Source
2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015
Keywords
DOI
Not available
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812
Abstract
There are only a few learning algorithms applicable to stochastic dynamic games. Learning in games is generally difficult because of the non-stationary environment in which each decision maker aims to learn its optimal decisions with minimal information in the presence of the other decision makers who are also learning. In the case of dynamic games, learning is more challenging because, while learning, the decision makers alter the state of the system and hence the future cost. In this paper, we present decentralized Q-learning algorithms for stochastic dynamic games, and study their convergence for the weakly acyclic case. We show that the decision makers employing these algorithms would eventually be using equilibrium policies almost surely in large classes of stochastic dynamic games.
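The decentralized setting the abstract describes, where each decision maker runs its own Q-learning on its own actions and treats the other learners as part of a non-stationary environment, can be illustrated with a minimal sketch. This is not the paper's algorithm (which adds further structure to obtain almost-sure convergence to equilibrium policies); it is plain independent tabular Q-learning on a hypothetical two-player, two-state coordination game, with all dynamics, costs, and parameters invented for illustration.

```python
import random

# Toy 2-state, 2-actions-per-agent stochastic game (hypothetical, not from
# the paper). Each agent keeps its own Q-table over (state, own action) and
# updates it as if the environment were stationary, ignoring the other learner.

N_STATES, N_ACTIONS = 2, 2

def step(state, a1, a2):
    """Hypothetical stage cost and transition: low cost when actions match."""
    cost = 0.0 if a1 == a2 else 1.0
    next_state = (state + a1 + a2) % N_STATES
    return cost, next_state

def epsilon_greedy(q_row, eps, rng):
    """Pick the lowest-cost action, exploring uniformly with probability eps."""
    if rng.random() < eps:
        return rng.randrange(N_ACTIONS)
    return min(range(N_ACTIONS), key=lambda a: q_row[a])

def run(steps=5000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    # One independent Q-table per agent: Q[agent][state][action], costs minimized.
    Q = [[[0.0] * N_ACTIONS for _ in range(N_STATES)] for _ in range(2)]
    state = 0
    for _ in range(steps):
        a1 = epsilon_greedy(Q[0][state], eps, rng)
        a2 = epsilon_greedy(Q[1][state], eps, rng)
        cost, nxt = step(state, a1, a2)
        for i, ai in enumerate((a1, a2)):
            # Standard single-agent Q-update; the other agent's learning makes
            # the effective environment non-stationary, which is exactly the
            # difficulty the paper addresses.
            target = cost + gamma * min(Q[i][nxt])
            Q[i][state][ai] += alpha * (target - Q[i][state][ai])
        state = nxt
    return Q

Q = run()
# Greedy joint policy after learning: one (a1, a2) pair per state.
policy = [(min(range(N_ACTIONS), key=lambda a: Q[0][s][a]),
           min(range(N_ACTIONS), key=lambda a: Q[1][s][a]))
          for s in range(N_STATES)]
print(policy)
```

In this toy coordination game the independent learners typically settle on matching actions, but nothing in the plain scheme above guarantees it; the paper's contribution is precisely a modified decentralized scheme whose joint behavior provably converges to equilibrium policies in weakly acyclic stochastic dynamic games.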
Pages: 6743-6748 (6 pages)
Related Papers (50 in total)
  • [1] Decentralized Q-Learning for Stochastic Teams and Games
    Arslan, Gurdal
    Yuksel, Serdar
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (04) : 1545 - 1558
  • [2] Asynchronous Decentralized Q-Learning in Stochastic Games
    Yongacoglu, Bora
    Arslan, Gurdal
    Yuksel, Serdar
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 5008 - 5013
  • [3] Decentralized Q-Learning with Constant Aspirations in Stochastic Games
    Yongacoglu, Bora
    Arslan, Gurdal
    Yuksel, Serdar
    CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019, : 1744 - 1749
  • [4] Sample Complexity of Decentralized Tabular Q-Learning for Stochastic Games
    Gao, Zuguang
    Ma, Qianqian
    Basar, Tamer
    Birge, John R.
    2023 AMERICAN CONTROL CONFERENCE, ACC, 2023, : 1098 - 1103
  • [5] Decentralized Q-Learning in Zero-sum Markov Games
    Sayin, Muhammed O.
    Zhang, Kaiqing
    Leslie, David S.
    Basar, Tamer
    Ozdaglar, Asuman
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Nash Q-learning for general-sum stochastic games
    Hu, JL
    Wellman, MP
    JOURNAL OF MACHINE LEARNING RESEARCH, 2004, 4 (06) : 1039 - 1069
  • [7] A Novel Heuristic Q-Learning Algorithm for Solving Stochastic Games
    Li, Jianwei
    Liu, Weiyi
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 1135 - 1144
  • [8] Selectively Decentralized Q-Learning
    Thanh Nguyen
    Mukhopadhyay, Snehasis
    2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017, : 328 - 333
  • [9] Expected Lenient Q-learning: a fast variant of the Lenient Q-learning algorithm for cooperative stochastic Markov games
    Amhraoui, Elmehdi
    Masrour, Tawfik
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (07) : 2781 - 2797
  • [10] Balancing Two-Player Stochastic Games with Soft Q-Learning
    Grau-Moya, Jordi
    Leibfried, Felix
    Bou-Ammar, Haitham
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 268 - 274