Learning controlled and targeted communication with the centralized critic for the multi-agent system

Cited: 0
Authors
Qingshuang Sun
Yuan Yao
Peng Yi
YuJiao Hu
Zhao Yang
Gang Yang
Xingshe Zhou
Affiliations
[1] Northwestern Polytechnical University,School of Computer Science
[2] Purple Mountain Laboratories,Future Network Center
Source
Applied Intelligence | 2023 / Volume 53
Keywords
Reinforcement learning; Centralized critic; Communication; Cooperation; Multi-agent system;
DOI: not available
Abstract
Multi-agent deep reinforcement learning (MDRL) has attracted attention for solving complex tasks. Two main challenges of MDRL are non-stationarity and partial observability from the perspective of individual agents, both of which impair the learning of cooperative policies. In this study, Controlled and Targeted Communication with the Centralized Critic (COTAC) is proposed, constructing a paradigm of centralized learning and decentralized execution with partial communication. COTAC decouples how the multi-agent system obtains environmental information during training from how it does so during execution: it makes the environment faced by agents stationary in the training phase and learns partial communication to overcome the limitation of partial observability in the execution phase. On this basis, decentralized actors learn controlled and targeted communication together with policies optimized by centralized critics during training. As a result, agents learn both when to communicate as senders and how to aggregate information in a targeted manner as receivers. COTAC is evaluated on two multi-agent scenarios with continuous action spaces. Experimental results demonstrate that only the subset of agents holding important information choose to send messages, and receivers aggregate the received messages in a targeted way by identifying the relevant important information, achieving better cooperation performance while reducing the communication traffic of the system.
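The abstract describes two learned communication decisions: a sender-side gate (when to transmit) and a receiver-side targeted aggregation of incoming messages. The paper's exact architecture is not given here, so the following is only a minimal NumPy sketch of that idea under assumptions: a sigmoid gate on each agent's observation decides sending, and a scaled dot-product attention over received messages performs targeted aggregation. All function names, the gate threshold, and the choice of sigmoid/softmax are illustrative, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def send_gate(obs, w_gate, threshold=0.5):
    """Sender side: sigmoid gate deciding whether this agent broadcasts.

    Returns (should_send, gate_score). In COTAC-like schemes the gate
    parameters would be trained with the policy; here w_gate is random.
    """
    score = 1.0 / (1.0 + np.exp(-obs @ w_gate))
    return score > threshold, score

def targeted_aggregate(query, messages):
    """Receiver side: attention-weighted sum of received messages.

    The receiver's own observation acts as the attention query, so
    messages judged more relevant get larger weights.
    """
    if len(messages) == 0:
        return np.zeros_like(query)           # nothing was sent
    keys = np.stack(messages)                 # (n_msgs, dim)
    logits = keys @ query / np.sqrt(query.size)
    weights = np.exp(logits - logits.max())   # stable softmax
    weights /= weights.sum()
    return weights @ keys                     # (dim,) aggregated message

# Toy run: 4 agents with 8-dim observations used directly as messages.
obs = rng.normal(size=(4, 8))
w_gate = rng.normal(size=8)
sent = [obs[i] for i in range(4) if send_gate(obs[i], w_gate)[0]]
agg = targeted_aggregate(obs[0], sent)        # agent 0 aggregates
```

Because only gated agents appear in `sent`, system-wide message traffic drops whenever the gate stays closed, which matches the traffic-reduction claim in the abstract.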
Pages: 14819-14837 (18 pages)