Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward

Cited by: 0
Authors
Qu, Guannan [1]
Lin, Yiheng [2]
Wierman, Adam [1]
Li, Na [3]
Affiliations
[1] CALTECH, Pasadena, CA 91125 USA
[2] Tsinghua Univ, Beijing, Peoples R China
[3] Harvard Univ, Cambridge, MA 02138 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020 | 2020, Vol. 33
Keywords
COMPLEXITY;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
It has long been recognized that multi-agent reinforcement learning (MARL) faces significant scalability issues because the sizes of the state and action spaces grow exponentially in the number of agents. In this paper, we identify a rich class of networked MARL problems where the model exhibits a local dependence structure that allows it to be solved in a scalable manner. Specifically, we propose a Scalable Actor-Critic (SAC) method that can learn a near-optimal localized policy for optimizing the average reward, with complexity that scales with the state-action space size of local neighborhoods rather than that of the entire network. Our result centers on identifying and exploiting an exponential decay property, which ensures that the effect of agents on each other decays exponentially fast in their graph distance.
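To make the scaling claim in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It only counts table sizes: the joint state-action space over all agents versus the state-action space of one agent's k-hop neighborhood. The line-graph topology, the agent count, the local state and action counts, and all function names are hypothetical choices made for illustration.

```python
from collections import deque


def k_hop_neighborhood(adjacency, agent, k):
    """Return the set of agents within graph distance k of `agent` (BFS)."""
    seen = {agent}
    frontier = deque([(agent, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # do not expand beyond distance k
        for nbr in adjacency[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return seen


def state_action_space_size(num_agents, local_states, local_actions):
    """Size of the joint state-action space for `num_agents` agents."""
    return (local_states * local_actions) ** num_agents


# Hypothetical setup: 20 agents on a line graph, each with
# 2 local states and 2 local actions.
n = 20
line_graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

neighborhood = k_hop_neighborhood(line_graph, agent=10, k=2)
global_size = state_action_space_size(n, 2, 2)
local_size = state_action_space_size(len(neighborhood), 2, 2)

print(sorted(neighborhood))  # agents 8 through 12
print(global_size)           # 4**20
print(local_size)            # 4**5
```

In this toy setting the global table has 4^20 entries while the 2-hop table has 4^5, which illustrates why a method whose complexity depends only on local neighborhoods can remain tractable as the network grows.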
Pages: 13
Related papers
50 in total
  • [1] Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems
    Qu, Guannan
    Wierman, Adam
    Li, Na
    LEARNING FOR DYNAMICS AND CONTROL, VOL 120, 2020, 120 : 256 - 266
  • [2] Multi-Agent Reinforcement Learning in Stochastic Networked Systems
    Lin, Yiheng
    Qu, Guannan
    Huang, Longbo
    Wierman, Adam
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [3] Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs
    Duc Thien Nguyen
    Yeoh, William
    Lau, Hoong Chuin
    Zilberstein, Shlomo
    Zhang, Chongjie
    PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 1447 - 1455
  • [4] Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs
    Duc Thien Nguyen
    Yeoh, William
    Hoong Chuin Lau
    Zilberstein, Shlomo
    Zhang, Chongjie
    AAMAS'14: PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2014, : 1341 - 1342
  • [5] Multi-Agent Reinforcement Learning with Reward Delays
    Zhang, Yuyang
    Zhang, Runyu
    Gu, Yuantao
    Li, Na
    LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
  • [6] Direct reward and indirect reward in multi-agent reinforcement learning
    Ohta, M
    ROBOCUP 2002: ROBOT SOCCER WORLD CUP VI, 2003, 2752 : 359 - 366
  • [7] Direct reward and indirect reward in multi-agent reinforcement learning
    Ohta, M. (ohta@carc.aist.go.jp), (Springer Verlag):
  • [8] Networked Multi-Agent Reinforcement Learning in Continuous Spaces
    Zhang, Kaiqing
    Yang, Zhuoran
    Basar, Tamer
    2018 IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2018, : 2771 - 2776
  • [9] Rationality of reward sharing in multi-agent reinforcement learning
    Kazuteru Miyazaki
    Shigenobu Kobayashi
    New Generation Computing, 2001, 19 : 157 - 172
  • [10] Rationality of reward sharing in multi-agent reinforcement learning
    Miyazaki, K
    Kobayashi, S
    NEW GENERATION COMPUTING, 2001, 19 (02) : 157 - 172