Satisficing Paths and Independent Multiagent Reinforcement Learning in Stochastic Games

Cited by: 1
Authors
Yongacoglu, Bora [1 ]
Arslan, Gurdal [2 ]
Yuksel, Serdar [1 ]
Affiliations
[1] Queens Univ, Dept Math & Stat, Kingston, ON, Canada
[2] Univ Hawaii Manoa, Dept Elect Engn, Honolulu, HI 96822 USA
Source
SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE | 2023, Vol. 5, No. 3
Keywords
multiagent reinforcement learning; independent learners; learning in games; stochastic games; decentralized systems; FICTITIOUS PLAY; UNCOUPLED DYNAMICS; CONVERGENCE; SYSTEMS; TEAMS; GO;
DOI
10.1137/22M1515112
Chinese Library Classification: O29 [Applied Mathematics]
Discipline Code: 070104
Abstract
In multiagent reinforcement learning, independent learners are those that do not observe the actions of other agents in the system. Because information is decentralized, it is challenging to design independent learners that drive play to equilibrium. This paper investigates the feasibility of using satisficing dynamics to guide independent learners to approximate equilibrium in stochastic games. For ε ≥ 0, an ε-satisficing policy update rule is any rule that instructs the agent to keep its current policy whenever that policy is an ε-best response to the policies of the remaining players; ε-satisficing paths are the sequences of joint policies obtained when each agent uses some ε-satisficing policy update rule to select its next policy. We establish structural results on the existence of ε-satisficing paths into ε-equilibrium in symmetric N-player games and in general two-player stochastic games. We then present an independent learning algorithm for N-player symmetric games and give high-probability guarantees of convergence to ε-equilibrium under self-play. This guarantee relies on symmetry alone, leveraging the previously unexploited structure of ε-satisficing paths.
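The core idea of an ε-satisficing update rule can be illustrated with a minimal sketch. The payoff matrix, function names, and the "resample uniformly when unsatisfied" search step below are all illustrative assumptions, not the paper's algorithm; the sketch only shows the defining property that a satisfied agent (one already ε-best-responding) never changes its policy.

```python
import random

# Hypothetical payoff matrix for the row player in a 2x2 matrix game;
# entries are illustrative, not taken from the paper.
PAYOFF = [[3.0, 0.0],
          [5.0, 1.0]]

def br_value(opponent_action):
    """Best attainable payoff against a fixed opponent action."""
    return max(row[opponent_action] for row in PAYOFF)

def is_eps_best_response(my_action, opponent_action, eps):
    """True when my_action is within eps of the best-response payoff."""
    return PAYOFF[my_action][opponent_action] >= br_value(opponent_action) - eps

def satisficing_update(my_action, opponent_action, eps, rng=random):
    """eps-satisficing rule: keep the policy if satisfied, else search.

    The uniform resampling used when unsatisfied is one arbitrary choice;
    any rule that keeps the policy when satisfied qualifies.
    """
    if is_eps_best_response(my_action, opponent_action, eps):
        return my_action                    # satisfied: do not change policy
    return rng.randrange(len(PAYOFF))       # unsatisfied: try another policy

# A satisfied agent never moves: action 1 vs opponent action 0 pays 5,
# which is the exact best response, so it is kept even for eps = 0.
assert satisficing_update(1, 0, eps=0.0) == 1
```

Iterating this rule for every agent produces a joint-policy sequence of the kind the paper calls an ε-satisficing path; the structural results concern when such paths can reach an ε-equilibrium, where every agent is simultaneously satisfied and the sequence becomes constant.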
Pages: 745-773
Number of pages: 29
Related Papers (50 records in total)
  • [1] Multiagent Graphical Games With Inverse Reinforcement Learning
    Donge, Vrushabh S.
    Lian, Bosen
    Lewis, Frank L.
    Davoudi, Ali
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (02): : 841 - 852
  • [2] On Passivity, Reinforcement Learning, and Higher Order Learning in Multiagent Finite Games
    Gao, Bolin
    Pavel, Lacra
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (01) : 121 - 136
  • [3] Multiagent reinforcement learning in extensive form games with complete information
    Akramizadeh, Ali
    Menhaj, Mohammad-B.
    Afshar, Ahmad
    ADPRL: 2009 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2009, : 205 - 211
  • [4] Two-Player Multiagent Graphical Games with Reinforcement Learning
    Lian, Bosen
    Wu, Jiacheng
    2024 IEEE 7TH INTERNATIONAL CONFERENCE ON INDUSTRIAL CYBER-PHYSICAL SYSTEMS, ICPS 2024, 2024
  • [5] Decentralized Q-Learning for Stochastic Teams and Games
    Arslan, Gurdal
    Yuksel, Serdar
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (04) : 1545 - 1558
  • [6] Nash-Minmax Strategy for Multiplayer Multiagent Graphical Games With Reinforcement Learning
    Lian, Bosen
    Xue, Wenqian
    Lewis, Frank L.
    Davoudi, Ali
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2025, 12 (01): : 763 - 775
  • [7] Leveraging Joint-Action Embedding in Multiagent Reinforcement Learning for Cooperative Games
    Lou, Xingzhou
    Zhang, Junge
    Du, Yali
    Yu, Chao
    He, Zhaofeng
    Huang, Kaiqi
    IEEE TRANSACTIONS ON GAMES, 2024, 16 (02) : 470 - 482
  • [8] Exploring selfish reinforcement learning in repeated games with stochastic rewards
    Verbeeck, Katja
    Nowe, Ann
    Parent, Johan
    Tuyls, Karl
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2007, 14 (03) : 239 - 269
  • [9] A survey and critique of multiagent deep reinforcement learning
    Hernandez-Leal, Pablo
    Kartal, Bilal
    Taylor, Matthew E.
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2019, 33 (06) : 750 - 797