Decentralized Q-Learning for Weakly Acyclic Stochastic Dynamic Games

Cited: 0
Authors
Arslan, Gurdal [1 ]
Yuksel, Serdar [2 ]
Affiliations
[1] Univ Hawaii Manoa, Dept Elect Engn, 440 Holmes Hall,2540 Dole St, Honolulu, HI 96822 USA
[2] Queens Univ, Dept Math & Stat, Kingston, ON K7L 3N6, Canada
Source
2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015
Keywords
DOI
Not available
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812
Abstract
There are only a few learning algorithms applicable to stochastic dynamic games. Learning in games is generally difficult because of the non-stationary environment in which each decision maker aims to learn its optimal decisions with minimal information in the presence of the other decision makers who are also learning. In the case of dynamic games, learning is more challenging because, while learning, the decision makers alter the state of the system and hence the future cost. In this paper, we present decentralized Q-learning algorithms for stochastic dynamic games, and study their convergence for the weakly acyclic case. We show that the decision makers employing these algorithms would eventually be using equilibrium policies almost surely in large classes of stochastic dynamic games.
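The decentralized setting the abstract describes, where each decision maker runs its own Q-learning on its own actions and treats the other learners as part of a non-stationary environment, can be illustrated with a minimal sketch. This is not the paper's algorithm (which adds further structure to obtain almost-sure convergence to equilibrium policies); it is plain independent tabular Q-learning on a hypothetical two-player, two-state coordination game, with all dynamics, costs, and parameters invented for illustration.

```python
import random

# Toy 2-state, 2-actions-per-agent stochastic game (hypothetical, not from
# the paper). Each agent keeps its own Q-table over (state, own action) and
# updates it as if the environment were stationary, ignoring the other learner.

N_STATES, N_ACTIONS = 2, 2

def step(state, a1, a2):
    """Hypothetical stage cost and transition: low cost when actions match."""
    cost = 0.0 if a1 == a2 else 1.0
    next_state = (state + a1 + a2) % N_STATES
    return cost, next_state

def epsilon_greedy(q_row, eps, rng):
    """Pick the lowest-cost action, exploring uniformly with probability eps."""
    if rng.random() < eps:
        return rng.randrange(N_ACTIONS)
    return min(range(N_ACTIONS), key=lambda a: q_row[a])

def run(steps=5000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    # One independent Q-table per agent: Q[agent][state][action], costs minimized.
    Q = [[[0.0] * N_ACTIONS for _ in range(N_STATES)] for _ in range(2)]
    state = 0
    for _ in range(steps):
        a1 = epsilon_greedy(Q[0][state], eps, rng)
        a2 = epsilon_greedy(Q[1][state], eps, rng)
        cost, nxt = step(state, a1, a2)
        for i, ai in enumerate((a1, a2)):
            # Standard single-agent Q-update; the other agent's learning makes
            # the effective environment non-stationary, which is exactly the
            # difficulty the paper addresses.
            target = cost + gamma * min(Q[i][nxt])
            Q[i][state][ai] += alpha * (target - Q[i][state][ai])
        state = nxt
    return Q

Q = run()
# Greedy joint policy after learning: one (a1, a2) pair per state.
policy = [(min(range(N_ACTIONS), key=lambda a: Q[0][s][a]),
           min(range(N_ACTIONS), key=lambda a: Q[1][s][a]))
          for s in range(N_STATES)]
print(policy)
```

In this toy coordination game the independent learners typically settle on matching actions, but nothing in the plain scheme above guarantees it; the paper's contribution is precisely a modified decentralized scheme whose joint behavior provably converges to equilibrium policies in weakly acyclic stochastic dynamic games.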
Pages: 6743-6748 (6 pages)
Related Papers (50 in total)
  • [1] Decentralized Q-Learning for Stochastic Teams and Games
    Arslan, Gurdal
    Yuksel, Serdar
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (04) : 1545 - 1558
  • [2] Asynchronous Decentralized Q-Learning in Stochastic Games
    Yongacoglu, Bora
    Arslan, Gurdal
    Yuksel, Serdar
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 5008 - 5013
  • [3] Decentralized Q-Learning with Constant Aspirations in Stochastic Games
    Yongacoglu, Bora
    Arslan, Gurdal
    Yuksel, Serdar
    CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019, : 1744 - 1749
  • [4] Sample Complexity of Decentralized Tabular Q-Learning for Stochastic Games
    Gao, Zuguang
    Ma, Qianqian
    Basar, Tamer
    Birge, John R.
    2023 AMERICAN CONTROL CONFERENCE, ACC, 2023, : 1098 - 1103
  • [5] Decentralized Q-Learning in Zero-sum Markov Games
    Sayin, Muhammed O.
    Zhang, Kaiqing
    Leslie, David S.
    Basar, Tamer
    Ozdaglar, Asuman
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Nash Q-learning for general-sum stochastic games
    Hu, JL
    Wellman, MP
    JOURNAL OF MACHINE LEARNING RESEARCH, 2004, 4 (06) : 1039 - 1069
  • [7] A Novel Heuristic Q-Learning Algorithm for Solving Stochastic Games
    Li, Jianwei
    Liu, Weiyi
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 1135 - 1144
  • [8] Selectively Decentralized Q-Learning
    Thanh Nguyen
    Mukhopadhyay, Snehasis
    2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017, : 328 - 333
  • [9] Expected Lenient Q-learning: a fast variant of the Lenient Q-learning algorithm for cooperative stochastic Markov games
    Amhraoui, Elmehdi
    Masrour, Tawfik
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (07) : 2781 - 2797
  • [10] Balancing Two-Player Stochastic Games with Soft Q-Learning
    Grau-Moya, Jordi
    Leibfried, Felix
    Bou-Ammar, Haitham
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 268 - 274