Distributed reinforcement learning in multi-agent networks

Cited by: 0
Authors
Kar, Soummya [1 ]
Moura, Jose M. F. [1 ]
Poor, H. Vincent [2 ]
Affiliations
[1] Carnegie Mellon Univ, Dept ECE, Pittsburgh, PA 15213 USA
[2] Princeton Univ, Dept EE, Princeton, NJ 08544 USA
Source
2013 IEEE 5TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL ADVANCES IN MULTI-SENSOR ADAPTIVE PROCESSING (CAMSAP 2013) | 2013
Funding
US National Science Foundation
Keywords
Multi-agent stochastic control; distributed Q-learning; reinforcement learning; collaborative network processing; consensus plus innovations; distributed stochastic approximation;
DOI
Not available
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Distributed reinforcement learning algorithms for collaborative multi-agent Markov decision processes (MDPs) are presented and analyzed. The networked setup consists of a collection of agents (learners) that respond differently (depending on their instantaneous one-stage random costs) to a global controlled state and the control actions of a remote controller. With the objective of jointly learning the optimal stationary control policy (in the absence of global state-transition and local agent-cost statistics) that minimizes the network-averaged infinite-horizon discounted cost, the paper presents distributed variants of Q-learning of the consensus + innovations type, in which each agent sequentially refines its learning parameters by locally processing its instantaneous payoff data together with the information received from neighboring agents. Under broad conditions on the multi-agent decision model and on the mean connectivity of the inter-agent communication network, the proposed distributed algorithms are shown to achieve optimal learning asymptotically; i.e., almost surely (a.s.) each network agent learns the value function and the optimal stationary control policy of the collaborative MDP. Further, convergence-rate estimates for the proposed class of distributed learning algorithms are obtained.
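The update structure the abstract describes can be illustrated with a minimal simulation. The sketch below is an assumption-laden toy, not the paper's algorithm or analysis: the 2-state/2-action MDP, the per-agent cost means, the 3-agent ring graph, and the step-size schedules are all invented for illustration. Only the form of the per-agent update follows the consensus + innovations idea, i.e., each agent pulls its Q-estimate toward its neighbors' estimates (consensus term) while incorporating its own instantaneous payoff observation (innovation term).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: 2 states, 2 actions, 3 agents on a ring.
n_agents, n_states, n_actions, gamma = 3, 2, 2, 0.9
# P[x, u] = next-state distribution given state x and action u (invented).
P = np.array([[[0.7, 0.3], [0.2, 0.8]],
              [[0.5, 0.5], [0.9, 0.1]]])
# Each agent sees a random one-stage cost with its own (invented) mean.
cost_mean = rng.uniform(0.0, 1.0, size=(n_agents, n_states, n_actions))
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

Q = np.zeros((n_agents, n_states, n_actions))
x = 0
for t in range(20000):
    u = int(rng.integers(n_actions))            # exploratory control action
    x_next = int(rng.choice(n_states, p=P[x, u]))
    alpha = 10.0 / (t + 100)                    # innovation gain (decays faster)
    beta = 5.0 / (t + 100) ** 0.8               # consensus gain (decays slower)
    Q_old = Q.copy()
    for n in range(n_agents):
        c = cost_mean[n, x, u] + 0.1 * rng.standard_normal()  # noisy local cost
        # Consensus term: disagreement with neighbors' current estimates.
        consensus = sum(Q_old[n, x, u] - Q_old[l, x, u] for l in neighbors[n])
        # Innovation term: local temporal-difference residual.
        innovation = c + gamma * Q_old[n, x_next].min() - Q_old[n, x, u]
        Q[n, x, u] = Q_old[n, x, u] - beta * consensus + alpha * innovation
    x = x_next

disagreement = np.abs(Q - Q.mean(axis=0)).max()
print(f"max inter-agent disagreement: {disagreement:.4f}")
```

Because the consensus gain decays more slowly than the innovation gain, the agents' Q-tables are driven together over time even though each agent only ever observes its own costs; the printed disagreement shrinks as the run length grows.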
Pages: 296+
Page count: 2