Satisficing Paths and Independent Multiagent Reinforcement Learning in Stochastic Games

Cited by: 1
Authors
Yongacoglu, Bora [1]
Arslan, Gurdal [2]
Yuksel, Serdar [1]
Affiliations
[1] Queens Univ, Dept Math & Stat, Kingston, ON, Canada
[2] Univ Hawaii Manoa, Dept Elect Engn, Honolulu, HI 96822 USA
Source
SIAM Journal on Mathematics of Data Science | 2023, Vol. 5, No. 3
Keywords
multiagent reinforcement learning; independent learners; learning in games; stochastic games; decentralized systems; fictitious play; uncoupled dynamics; convergence; systems; teams; Go
DOI
10.1137/22M1515112
Chinese Library Classification number
O29 [Applied Mathematics]
Subject classification code
070104
Abstract
In multiagent reinforcement learning, independent learners are those that do not observe the actions of other agents in the system. Due to the decentralization of information, it is challenging to design independent learners that drive play to equilibrium. This paper investigates the feasibility of using satisficing dynamics to guide independent learners to approximate equilibrium in stochastic games. For $\epsilon \geq 0$, an $\epsilon$-satisficing policy update rule is any rule that instructs the agent to not change its policy when it is $\epsilon$-best-responding to the policies of the remaining players; $\epsilon$-satisficing paths are defined to be sequences of joint policies obtained when each agent uses some $\epsilon$-satisficing policy update rule to select its next policy. We establish structural results on the existence of $\epsilon$-satisficing paths into $\epsilon$-equilibrium in both symmetric $N$-player games and general stochastic games with two players. We then present an independent learning algorithm for $N$-player symmetric games and give high probability guarantees of convergence to $\epsilon$-equilibrium under self-play. This guarantee is made using symmetry alone, leveraging the previously unexploited structure of $\epsilon$-satisficing paths.
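The definition of an $\epsilon$-satisficing update rule in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes the simplest special case (a stateless game with pure actions standing in for policies), and the names `is_eps_best_response`, `satisficing_step`, `satisficing_path`, and `coordination_payoff` are hypothetical. The choice to resample uniformly at random when an agent is unsatisfied is also an assumption; the definition only constrains what a satisfied agent does.

```python
import random

def is_eps_best_response(payoff, actions, i, joint, eps):
    """True if agent i's action in `joint` is within eps of its best payoff
    against the (fixed) actions of the remaining players."""
    best = max(payoff(i, joint[:i] + (a,) + joint[i + 1:]) for a in actions)
    return payoff(i, joint) >= best - eps

def satisficing_step(payoff, actions, joint, eps, rng):
    """One joint update: satisfied agents keep their action, as the definition
    requires; unsatisfied agents may switch (assumption: uniform resampling)."""
    nxt = []
    for i, a in enumerate(joint):
        if is_eps_best_response(payoff, actions, i, joint, eps):
            nxt.append(a)
        else:
            nxt.append(rng.choice(actions))
    return tuple(nxt)

def satisficing_path(payoff, actions, start, eps, max_steps=1000, seed=0):
    """Generate a satisficing path, stopping once every agent is
    eps-best-responding (an eps-equilibrium) or max_steps is reached."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(max_steps):
        joint = path[-1]
        if all(is_eps_best_response(payoff, actions, i, joint, eps)
               for i in range(len(joint))):
            break
        path.append(satisficing_step(payoff, actions, joint, eps, rng))
    return path

def coordination_payoff(i, joint):
    """Hypothetical symmetric two-player coordination game:
    both players receive 1 if their actions match, else 0."""
    return 1.0 if joint[0] == joint[1] else 0.0

path = satisficing_path(coordination_payoff, [0, 1], start=(0, 1), eps=0.0)
print(path)
```

In this toy game both players start unsatisfied, so both are free to switch; once their actions match, each is best-responding and the rule freezes them, which is why the path ends at an equilibrium.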
Pages: 745-773
Page count: 29