Multi-Agent Reinforcement Learning With Distributed Targeted Multi-Agent Communication

Cited by: 0

Authors:

Xu, Chi [1]

Zhang, Hui [2]

Zhang, Ya [2]

Affiliations:

[1] Southeast Univ & Monash Univ, Joint Grad Sch Suzhou, Suzhou, Peoples R China

[2] Southeast Univ, Sch Automat, Nanjing, Peoples R China

Source:

2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC | 2023

Funding:

National Key Research and Development Program of China;

Keywords:

multi-agent reinforcement learning; communication; distributed RL; self-attention mechanism;

DOI:

10.1109/CCDC58219.2023.10327314

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Centralized training with distributed execution (CTDE) is a commonly used paradigm in multi-agent reinforcement learning (MARL). It usually assumes that the global state of the environment is available during training, an assumption that is often hard to satisfy in practice because of constraints on data transfer and processing power. Fully distributed MARL algorithms do not depend on knowledge of the global state: each agent is trained independently and treats the remaining agents as part of the environment. However, applying single-agent algorithms to multi-agent systems runs into the non-stationarity of the environment and makes it difficult to form effective collaborative strategies. In this paper, we propose a new method, Distributed Targeted Multi-Agent Communication (DTMAC), in which each agent generates messages and passes them to other agents, explicitly enhancing collaboration among individual agents and facilitating the formation of collaborative strategies. Experiments illustrate the effectiveness of the method.
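The abstract does not spell out how messages are produced or targeted. As a rough illustration only, and assuming the self-attention mechanism listed in the keywords takes the standard scaled dot-product form, the sketch below shows how each agent might weight the messages it receives from the other agents. The function names, the query/key/value split, and the toy dimensions are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate_messages(queries, keys, values):
    """Hypothetical attention-based message aggregation (not the paper's code).

    queries[i]: what receiver i is looking for.
    keys[j]:    signature that sender j attaches to its message.
    values[j]:  message content produced by sender j.

    Each receiver attends only to the other agents (j != i), so no global
    state is needed. Returns the aggregated message per agent and the
    attention weights per agent, keyed by sender index.
    """
    n_agents = len(queries)
    scale = math.sqrt(len(keys[0]))
    msg_dim = len(values[0])
    aggregated, all_weights = [], []
    for i in range(n_agents):
        senders = [j for j in range(n_agents) if j != i]
        scores = [dot(queries[i], keys[j]) / scale for j in senders]
        weights = softmax(scores)
        # Weighted sum of the senders' message contents.
        message = [
            sum(w * values[j][d] for w, j in zip(weights, senders))
            for d in range(msg_dim)
        ]
        aggregated.append(message)
        all_weights.append(dict(zip(senders, weights)))
    return aggregated, all_weights
```

In a full agent, the aggregated message would presumably be concatenated with the agent's local observation before it is fed to that agent's own policy or value network; that wiring is left out here because the abstract gives no details on it.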


Pages: 2915-2920

Page count: 6
