Enhancing Multi-agent Coordination via Dual-channel Consensus

Cited by: 0
Authors
Zhang, Qingyang [1, 2]
Wang, Kaishen [1, 3]
Ruan, Jingqing [1, 2]
Yang, Yiming [1]
Xing, Dengpeng [1, 3]
Xu, Bo [1, 2, 3]
Affiliations
[1] Chinese Academy of Sciences, Institute of Automation, Beijing 100190, People's Republic of China
[2] University of Chinese Academy of Sciences, School of Future Technology, Beijing 100049, People's Republic of China
[3] University of Chinese Academy of Sciences, School of Artificial Intelligence, Beijing 100049, People's Republic of China
Keywords
Multi-agent reinforcement learning; contrastive representation learning; consensus; multi-agent cooperation; cognitive consistency; communication; systems
DOI
10.1007/s11633-023-1464-2
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Successful coordination in multi-agent systems requires agents to reach consensus. Previous work achieves this through information sharing, either explicitly via communication protocols or implicitly via behavior prediction. However, these methods may fail when communication channels are unavailable or when the modeling is biased. In this work, we propose dual-channel consensus (DuCC), learned via contrastive representation learning for fully cooperative multi-agent systems, which needs no explicit communication and avoids biased modeling. DuCC comprises two types of consensus: temporally extended consensus within each agent (inner-agent consensus) and mutual consensus across agents (inter-agent consensus). To achieve DuCC, we design two objectives: one learns representations of slow environmental features for inner-agent consensus, and the other realizes cognitive consistency as inter-agent consensus. DuCC is highly general and can be flexibly combined with various multi-agent reinforcement learning (MARL) algorithms. Extensive experiments on the StarCraft Multi-Agent Challenge and Google Research Football demonstrate that our method efficiently reaches consensus and outperforms state-of-the-art MARL algorithms.
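The abstract names contrastive representation learning as the mechanism but does not spell out the objectives. As a rough illustration only, the sketch below implements a generic InfoNCE-style contrastive loss in NumPy. It is not the authors' objective: the function name `info_nce_loss`, the `temperature` value, and the way positive pairs are formed are all assumptions made for this example.

```python
import numpy as np


def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (illustrative, not the DuCC objective).

    Row i of `anchors` is paired with row i of `positives`;
    every other row in the batch acts as a negative.
    """
    # L2-normalise so that dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # Pairwise similarity matrix, shape (batch, batch).
    logits = a @ p.T / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The matching pair for each anchor lies on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


# Hypothetical usage: embeddings of the same situation from two sources.
rng = np.random.default_rng(0)
z_a = rng.normal(size=(8, 16))
z_b = z_a + 0.01 * rng.normal(size=(8, 16))
loss = info_nce_loss(z_a, z_b)
```

Under this reading, one could take `anchors` and `positives` to be two agents' embeddings of the same timestep (a stand-in for inter-agent consensus), or one agent's embeddings at nearby timesteps (a stand-in for inner-agent consensus). That pairing scheme is an assumption of this sketch; the abstract does not specify how pairs are constructed.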
Pages: 349-368
Number of pages: 20