Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward

Cited by: 0

Authors:

Qu, Guannan [1]

Lin, Yiheng [2]

Wierman, Adam [1]

Li, Na [3]

Affiliations:

[1] CALTECH, Pasadena, CA 91125 USA

[2] Tsinghua Univ, Beijing, Peoples R China

[3] Harvard Univ, Cambridge, MA 02138 USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020 | 2020 / Vol. 33

Keywords:

COMPLEXITY;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

It has long been recognized that multi-agent reinforcement learning (MARL) faces significant scalability issues due to the fact that the sizes of the state and action spaces are exponentially large in the number of agents. In this paper, we identify a rich class of networked MARL problems where the model exhibits a local dependence structure that allows it to be solved in a scalable manner. Specifically, we propose a Scalable Actor-Critic (SAC) method that can learn a near-optimal localized policy for optimizing the average reward with complexity scaling with the state-action space size of local neighborhoods, as opposed to the entire network. Our result centers around identifying and exploiting an exponential decay property that ensures the effect of agents on each other decays exponentially fast in their graph distance.
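
The scaling contrast described in the abstract can be illustrated with a small counting sketch. This is not the paper's SAC method; it only compares the joint state-action space of the whole network with that of one agent's local neighborhood. The line-graph topology, the neighborhood radius `k`, and the per-agent count `local_pairs` are assumptions chosen for illustration, and the helper names are hypothetical.

```python
from collections import deque


def k_hop_neighborhood(adjacency, agent, k):
    """Return the set of agents within graph distance k of `agent` (BFS)."""
    dist = {agent: 0}
    queue = deque([agent])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond radius k
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def line_graph(n):
    """Hypothetical topology: agents 0..n-1 connected in a line."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


n_agents = 20
local_pairs = 4  # assumed: 2 local states x 2 local actions per agent
adjacency = line_graph(n_agents)

neighborhood = k_hop_neighborhood(adjacency, agent=10, k=2)

# Joint state-action space of the entire network: exponential in n_agents.
global_size = local_pairs ** n_agents
# Joint state-action space of the neighborhood: exponential only in its size.
local_size = local_pairs ** len(neighborhood)

print(sorted(neighborhood), local_size, global_size)
```

On this assumed topology the neighborhood size depends on `k` and not on `n_agents`, so the local count stays fixed as the network grows while the global count grows exponentially.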


Number of pages: 13
