Satisficing Paths and Independent Multiagent Reinforcement Learning in Stochastic Games

Cited by: 1
Authors
Yongacoglu, Bora [1 ]
Arslan, Gurdal [2 ]
Yuksel, Serdar [1 ]
Affiliations
[1] Queens Univ, Dept Math & Stat, Kingston, ON, Canada
[2] Univ Hawaii Manoa, Dept Elect Engn, Honolulu, HI 96822 USA
Source
SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2023, Vol. 5, No. 3
Keywords
multiagent reinforcement learning; independent learners; learning in games; stochastic games; decentralized systems; FICTITIOUS PLAY; UNCOUPLED DYNAMICS; CONVERGENCE; SYSTEMS; TEAMS; GO
DOI
10.1137/22M1515112
Chinese Library Classification
O29 [Applied Mathematics]
Discipline Code
070104
Abstract
In multiagent reinforcement learning, independent learners are those that do not observe the actions of other agents in the system. Due to the decentralization of information, it is challenging to design independent learners that drive play to equilibrium. This paper investigates the feasibility of using satisficing dynamics to guide independent learners to approximate equilibrium in stochastic games. For ε ≥ 0, an ε-satisficing policy update rule is any rule that instructs the agent not to change its policy when it is ε-best-responding to the policies of the remaining players; ε-satisficing paths are defined to be sequences of joint policies obtained when each agent uses some ε-satisficing policy update rule to select its next policy. We establish structural results on the existence of ε-satisficing paths into ε-equilibrium in both symmetric N-player games and general stochastic games with two players. We then present an independent learning algorithm for N-player symmetric games and give high-probability guarantees of convergence to ε-equilibrium under self-play. This guarantee is made using symmetry alone, leveraging the previously unexploited structure of ε-satisficing paths.
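To make the definition above concrete, the following is a minimal sketch of the ε-satisficing update rule in LaTeX notation; the symbols used here (agent i's objective J^i and the joint policy decomposed as (\pi^i, \pi^{-i})) are illustrative assumptions rather than notation quoted from the paper:

\[
  \pi^i_{t+1} = \pi^i_t
  \qquad \text{whenever} \qquad
  J^i\bigl(\pi^i_t, \pi^{-i}_t\bigr)
  \;\geq\;
  \sup_{\tilde{\pi}^i} J^i\bigl(\tilde{\pi}^i, \pi^{-i}_t\bigr) - \epsilon.
\]

Read this way, an agent that is already ε-best-responding keeps its current policy, and only unsatisfied agents revise theirs; a sequence of joint policies generated when every agent follows some such rule is an ε-satisficing path, and a joint policy at which the inequality holds for all agents simultaneously is an ε-equilibrium, i.e., a rest point of these dynamics.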
Pages: 745-773
Number of pages: 29
Related Papers (50 in total)
  • [21] Fang, Xiaohan; Wang, Jinkuan; Yin, Chunhui; Han, Yinghua; Zhao, Qiang. Multiagent Reinforcement Learning With Learning Automata for Microgrid Energy Management and Decision Optimization. Proceedings of the 32nd 2020 Chinese Control and Decision Conference (CCDC 2020), 2020: 779-784.
  • [22] Guang, Weiwei; Wang, Xin; Tan, Lihua; Sun, Jian; Huang, Tingwen. Prescribed-Time Optimal Consensus for Switched Stochastic Multiagent Systems: Reinforcement Learning Strategy. IEEE Transactions on Emerging Topics in Computational Intelligence, 2025, 9(1): 75-86.
  • [24] Hu, Yudong; Han, Congying; Li, Haoran; Guo, Tiande. Modeling opponent learning in multiagent repeated games. Applied Intelligence, 2023, 53(13): 17194-17210.
  • [25] Crespo, Joao; Wichert, Andreas. Reinforcement learning applied to games. SN Applied Sciences, 2020, 2(5).
  • [26] Hu, Yujing; Gao, Yang; An, Bo. Multiagent Reinforcement Learning With Unshared Value Functions. IEEE Transactions on Cybernetics, 2015, 45(4): 647-662.
  • [28] Wong, Annie; Bäck, Thomas; Kononova, Anna V.; Plaat, Aske. Deep multiagent reinforcement learning: challenges and directions. Artificial Intelligence Review, 2023, 56(6): 5023-5056.
  • [29] Hernandez-Leal, Pablo; Kartal, Bilal; Taylor, Matthew E. A survey and critique of multiagent deep reinforcement learning. Autonomous Agents and Multi-Agent Systems, 2019, 33: 750-797.
  • [30] Ye, Lintao; Figura, Martin; Lin, Yixuan; Pal, Mainak; Das, Pranoy; Liu, Ji; Gupta, Vijay. Resilient Multiagent Reinforcement Learning With Function Approximation. IEEE Transactions on Automatic Control, 2024, 69(12): 8497-8512.