Distributed reinforcement learning in multi-agent networks

Cited by: 0
Authors
Kar, Soummya [1 ]
Moura, Jose M. F. [1 ]
Poor, H. Vincent [2 ]
Affiliations
[1] Carnegie Mellon Univ, Dept ECE, Pittsburgh, PA 15213 USA
[2] Princeton Univ, Dept EE, Princeton, NJ 08544 USA
Source
2013 IEEE 5TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL ADVANCES IN MULTI-SENSOR ADAPTIVE PROCESSING (CAMSAP 2013) | 2013
Funding
U.S. National Science Foundation
Keywords
Multi-agent stochastic control; distributed Q-learning; reinforcement learning; collaborative network processing; consensus plus innovations; distributed stochastic approximation;
DOI
Not available
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Discipline code
0812
Abstract
Distributed reinforcement learning algorithms for collaborative multi-agent Markov decision processes (MDPs) are presented and analyzed. The networked setup consists of a collection of agents (learners) that respond differently (depending on their instantaneous one-stage random costs) to a global controlled state and the control actions of a remote controller. With the objective of jointly learning the optimal stationary control policy (in the absence of global state transition and local agent cost statistics) that minimizes the network-averaged infinite-horizon discounted cost, the paper presents distributed variants of Q-learning of the consensus + innovations type, in which each agent sequentially refines its learning parameters by locally processing its instantaneous payoff data and the information received from neighboring agents. Under broad conditions on the multi-agent decision model and the mean connectivity of the inter-agent communication network, the proposed distributed algorithms are shown to achieve optimal learning asymptotically, i.e., almost surely (a.s.) each network agent is shown to learn the value function and the optimal stationary control policy of the collaborative MDP asymptotically. Further, convergence rate estimates for the proposed class of distributed learning algorithms are obtained.
Pages: 296 / +
Number of pages: 2
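
To make the abstract's update rule concrete, below is a minimal Python sketch of a consensus + innovations Q-learning step of the general kind described: each agent pulls its Q-table toward its neighbors' tables (consensus) and applies a local temporal-difference correction driven by its own one-stage cost (innovation). The function name qd_step, the specific weight sequences, the toy network, and the random dynamics are illustrative assumptions for exposition, not the paper's exact algorithm.

```python
# Sketch of a consensus + innovations distributed Q-learning step.
# Assumptions (not from the paper): a finite MDP with a shared global
# state/action, synchronous updates, and the weight choices below.
import numpy as np

def qd_step(Q, agent, neighbors, s, a, s_next, cost, t, gamma=0.95):
    """Return `agent`'s updated Q-table given all agents' current tables Q.

    Q         : array (n_agents, n_states, n_actions) of current estimates
    neighbors : indices of this agent's neighbors in the communication graph
    cost      : the agent's instantaneous one-stage random cost
    t         : iteration counter driving the decaying weight sequences
    """
    alpha = 1.0 / (t + 1)          # innovation weight
    beta = 1.0 / (t + 1) ** 0.75   # consensus weight, decays more slowly

    # Consensus term: disagreement of the whole local table with neighbors.
    consensus = sum(Q[agent] - Q[l] for l in neighbors)

    # Innovation term: temporal-difference error at the visited pair (s, a),
    # using min over actions since the objective is cost minimization.
    td = cost + gamma * Q[agent, s_next].min() - Q[agent, s, a]

    new = Q[agent] - beta * consensus
    new[s, a] += alpha * td
    return new

# Toy run: 3 agents on a line graph, 4 states, 2 actions, random dynamics.
rng = np.random.default_rng(0)
Q = np.zeros((3, 4, 2))
nbrs = [[1], [0, 2], [1]]
for t in range(2000):
    s, a = int(rng.integers(4)), int(rng.integers(2))
    s_next = int(rng.integers(4))
    Q = np.stack([qd_step(Q, n, nbrs[n], s, a, s_next, rng.random(), t)
                  for n in range(3)])
```

The one design choice mirrored from the consensus + innovations idea is that the consensus weight decays more slowly than the innovation weight, so the agents' estimates are driven toward agreement while each still tracks its local cost feedback.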