Distributed reinforcement learning in multi-agent networks

Cited by: 0
Authors
Kar, Soummya [1 ]
Moura, Jose M. F. [1 ]
Poor, H. Vincent [2 ]
Affiliations
[1] Carnegie Mellon Univ, Dept ECE, Pittsburgh, PA 15213 USA
[2] Princeton Univ, Dept EE, Princeton, NJ 08544 USA
Source
2013 IEEE 5TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL ADVANCES IN MULTI-SENSOR ADAPTIVE PROCESSING (CAMSAP 2013) | 2013
Funding
US National Science Foundation
Keywords
Multi-agent stochastic control; distributed Q-learning; reinforcement learning; collaborative network processing; consensus plus innovations; distributed stochastic approximation;
DOI
Not available
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Distributed reinforcement learning algorithms for collaborative multi-agent Markov decision processes (MDPs) are presented and analyzed. The networked setup consists of a collection of agents (learners) that respond differently (depending on their instantaneous one-stage random costs) to a global controlled state and the control actions of a remote controller. With the objective of jointly learning the optimal stationary control policy (in the absence of global state-transition and local agent-cost statistics) that minimizes the network-averaged infinite-horizon discounted cost, the paper presents distributed variants of Q-learning of the consensus + innovations type, in which each agent sequentially refines its learning parameters by locally processing its instantaneous payoff data together with the information received from neighboring agents. Under broad conditions on the multi-agent decision model and on the mean connectivity of the inter-agent communication network, the proposed distributed algorithms are shown to achieve optimal learning asymptotically; i.e., almost surely (a.s.) each network agent learns the value function and the optimal stationary control policy of the collaborative MDP. Further, convergence-rate estimates for the proposed class of distributed learning algorithms are obtained.
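The update structure the abstract describes can be illustrated with a minimal simulation. The sketch below is an assumption-laden toy, not the paper's algorithm or analysis: the 2-state/2-action MDP, the per-agent cost means, the 3-agent ring graph, and the step-size schedules are all invented for illustration. Only the form of the per-agent update follows the consensus + innovations idea, i.e., each agent pulls its Q-estimate toward its neighbors' estimates (consensus term) while incorporating its own instantaneous payoff observation (innovation term).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: 2 states, 2 actions, 3 agents on a ring.
n_agents, n_states, n_actions, gamma = 3, 2, 2, 0.9
# P[x, u] = next-state distribution given state x and action u (invented).
P = np.array([[[0.7, 0.3], [0.2, 0.8]],
              [[0.5, 0.5], [0.9, 0.1]]])
# Each agent sees a random one-stage cost with its own (invented) mean.
cost_mean = rng.uniform(0.0, 1.0, size=(n_agents, n_states, n_actions))
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

Q = np.zeros((n_agents, n_states, n_actions))
x = 0
for t in range(20000):
    u = int(rng.integers(n_actions))            # exploratory control action
    x_next = int(rng.choice(n_states, p=P[x, u]))
    alpha = 10.0 / (t + 100)                    # innovation gain (decays faster)
    beta = 5.0 / (t + 100) ** 0.8               # consensus gain (decays slower)
    Q_old = Q.copy()
    for n in range(n_agents):
        c = cost_mean[n, x, u] + 0.1 * rng.standard_normal()  # noisy local cost
        # Consensus term: disagreement with neighbors' current estimates.
        consensus = sum(Q_old[n, x, u] - Q_old[l, x, u] for l in neighbors[n])
        # Innovation term: local temporal-difference residual.
        innovation = c + gamma * Q_old[n, x_next].min() - Q_old[n, x, u]
        Q[n, x, u] = Q_old[n, x, u] - beta * consensus + alpha * innovation
    x = x_next

disagreement = np.abs(Q - Q.mean(axis=0)).max()
print(f"max inter-agent disagreement: {disagreement:.4f}")
```

Because the consensus gain decays more slowly than the innovation gain, the agents' Q-tables are driven together over time even though each agent only ever observes its own costs; the printed disagreement shrinks as the run length grows.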
Pages: 296+
Page count: 2