Satisficing Paths and Independent Multiagent Reinforcement Learning in Stochastic Games

Cited by: 1
Authors
Yongacoglu, Bora [1 ]
Arslan, Gurdal [2 ]
Yuksel, Serdar [1 ]
Affiliations
[1] Queens Univ, Dept Math & Stat, Kingston, ON, Canada
[2] Univ Hawaii Manoa, Dept Elect Engn, Honolulu, HI 96822 USA
Source
SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2023, Vol. 5, No. 3
Keywords
multiagent reinforcement learning; independent learners; learning in games; stochastic games; decentralized systems; FICTITIOUS PLAY; UNCOUPLED DYNAMICS; CONVERGENCE; SYSTEMS; TEAMS; GO
DOI
10.1137/22M1515112
Chinese Library Classification
O29 [Applied Mathematics]
Discipline Code
070104
Abstract
In multiagent reinforcement learning, independent learners are those that do not observe the actions of other agents in the system. Due to the decentralization of information, it is challenging to design independent learners that drive play to equilibrium. This paper investigates the feasibility of using satisficing dynamics to guide independent learners to approximate equilibrium in stochastic games. For ε ≥ 0, an ε-satisficing policy update rule is any rule that instructs the agent not to change its policy when it is ε-best-responding to the policies of the remaining players; ε-satisficing paths are defined to be sequences of joint policies obtained when each agent uses some ε-satisficing policy update rule to select its next policy. We establish structural results on the existence of ε-satisficing paths into ε-equilibrium in both symmetric N-player games and general stochastic games with two players. We then present an independent learning algorithm for N-player symmetric games and give high-probability guarantees of convergence to ε-equilibrium under self-play. This guarantee is made using symmetry alone, leveraging the previously unexploited structure of ε-satisficing paths.
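To make the definition above concrete, the following is a minimal sketch of the ε-satisficing update rule in LaTeX notation; the symbols used here (agent i's objective J^i and the joint policy decomposed as (\pi^i, \pi^{-i})) are illustrative assumptions rather than notation quoted from the paper:

\[
  \pi^i_{t+1} = \pi^i_t
  \qquad \text{whenever} \qquad
  J^i\bigl(\pi^i_t, \pi^{-i}_t\bigr)
  \;\geq\;
  \sup_{\tilde{\pi}^i} J^i\bigl(\tilde{\pi}^i, \pi^{-i}_t\bigr) - \epsilon.
\]

Read this way, an agent that is already ε-best-responding keeps its current policy, and only unsatisfied agents revise theirs; a sequence of joint policies generated when every agent follows some such rule is an ε-satisficing path, and a joint policy at which the inequality holds for all agents simultaneously is an ε-equilibrium, i.e., a rest point of these dynamics.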
Pages: 745-773
Number of pages: 29
Related Papers (50 in total)
  • [21] Fang, Xiaohan; Wang, Jinkuan; Yin, Chunhui; Han, Yinghua; Zhao, Qiang. Multiagent Reinforcement Learning With Learning Automata for Microgrid Energy Management and Decision Optimization. Proceedings of the 32nd 2020 Chinese Control and Decision Conference (CCDC 2020), 2020: 779-784.
  • [22] Guang, Weiwei; Wang, Xin; Tan, Lihua; Sun, Jian; Huang, Tingwen. Prescribed-Time Optimal Consensus for Switched Stochastic Multiagent Systems: Reinforcement Learning Strategy. IEEE Transactions on Emerging Topics in Computational Intelligence, 2025, 9(1): 75-86.
  • [24] Hu, Yudong; Han, Congying; Li, Haoran; Guo, Tiande. Modeling opponent learning in multiagent repeated games. Applied Intelligence, 2023, 53(13): 17194-17210.
  • [25] Crespo, Joao; Wichert, Andreas. Reinforcement learning applied to games. SN Applied Sciences, 2020, 2(5).
  • [26] Hu, Yujing; Gao, Yang; An, Bo. Multiagent Reinforcement Learning With Unshared Value Functions. IEEE Transactions on Cybernetics, 2015, 45(4): 647-662.
  • [28] Wong, Annie; Bäck, Thomas; Kononova, Anna V.; Plaat, Aske. Deep multiagent reinforcement learning: challenges and directions. Artificial Intelligence Review, 2023, 56(6): 5023-5056.
  • [29] Hernandez-Leal, Pablo; Kartal, Bilal; Taylor, Matthew E. A survey and critique of multiagent deep reinforcement learning. Autonomous Agents and Multi-Agent Systems, 2019, 33: 750-797.
  • [30] Ye, Lintao; Figura, Martin; Lin, Yixuan; Pal, Mainak; Das, Pranoy; Liu, Ji; Gupta, Vijay. Resilient Multiagent Reinforcement Learning With Function Approximation. IEEE Transactions on Automatic Control, 2024, 69(12): 8497-8512.