Enhancing Multi-agent Coordination via Dual-channel Consensus

Cited by: 0

Authors:

Zhang, Qingyang [1,2]

Wang, Kaishen [1,3]

Ruan, Jingqing [1,2]

Yang, Yiming [1]

Xing, Dengpeng [1,3]

Xu, Bo [1,2,3]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China

[2] Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China

[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China

Source:

MACHINE INTELLIGENCE RESEARCH | 2024 / Vol. 21 / Issue 02

Keywords:

Multi-agent reinforcement learning; contrastive representation learning; consensus; multi-agent cooperation; cognitive consistency; COMMUNICATION; SYSTEMS;

DOI:

10.1007/s11633-023-1464-2

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Successful coordination in multi-agent systems requires agents to achieve consensus. Previous work pursues consensus through information sharing, either explicitly via communication protocols or implicitly via behavior prediction. However, these methods may fail in the absence of communication channels or due to biased modeling. In this work, we propose to develop dual-channel consensus (DuCC) via contrastive representation learning for fully cooperative multi-agent systems, which does not need explicit communication and avoids biased modeling. DuCC comprises two types of consensus: temporally extended consensus within each agent (inner-agent consensus) and mutual consensus across agents (inter-agent consensus). To achieve DuCC, we design two objectives: one learns representations of slow environmental features for inner-agent consensus, and the other realizes cognitive consistency as inter-agent consensus. DuCC is highly general and can be flexibly combined with various multi-agent reinforcement learning (MARL) algorithms. Extensive experiments on the StarCraft Multi-Agent Challenge and Google Research Football demonstrate that our method efficiently reaches consensus and outperforms state-of-the-art MARL algorithms.


Pages: 349-368

Page count: 20

