Enhancing collaboration in multi-agent reinforcement learning with correlated trajectories

Cited by: 0

Authors:

Wang, Siying ^{[1]}

Du, Hongfei ^{[2]}

Zhou, Yang ^{[2]}

Zhao, Zhitong ^{[2]}

Zhang, Ruoning ^{[2]}

Chen, Wenyu ^{[2]}

Affiliations:

[1] Univ Elect Sci & Technol China, Sch Automat Engn, Chengdu, Peoples R China

[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China

Source:

KNOWLEDGE-BASED SYSTEMS | 2024 / Vol. 305

Funding:

National Natural Science Foundation of China;

Keywords:

Multi-agent systems; Deep reinforcement learning; Graph neural network; Pearson correlation coefficient; TRAFFIC LIGHT CONTROL;

DOI:

10.1016/j.knosys.2024.112665

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Collaborative behaviors in human social activities can be modeled with multi-agent reinforcement learning and used to train the collaborative policies of agents to achieve efficient cooperation. In general, agents with similar behaviors share a degree of behavioral common cognition and are more likely to understand each other's intentions and then form cooperative policies. Traditional approaches focus on the collaborative allocation process between agents, ignoring the effects of similar behaviors and common cognition characteristics in collaborative interactions. To better establish collaborative relationships between agents, we propose a novel multi-agent reinforcement learning collaborative algorithm based on the similarity of agents' behavioral features. In this model, the interactions of agents are modeled as a graph neural network. Specifically, the Pearson correlation coefficient is used to compute the similarity of the agents' history trajectories as a means of determining their behavioral common cognition, which in turn sets the weights of the edges in the graph neural network. In addition, we design a state information complementation module with a transformer-encoder structure to enhance the decision representation of the agents. Experimental results on Predator-Prey and StarCraft II show that the proposed method can effectively enhance the collaborative behaviors between agents and improve the training efficiency of collaborative models.
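As a rough illustration of the edge-weighting idea in the abstract, the sketch below computes pairwise Pearson correlation coefficients between agents' history trajectories and uses them as a weighted adjacency matrix. It is not the authors' implementation. The function name `pearson_edge_weights`, the encoding of each trajectory as a flat numeric vector of equal length, the zero weight for constant trajectories, and the removal of self-loops are all assumptions made for the example.

```python
import numpy as np


def pearson_edge_weights(trajectories):
    """Return an (n_agents, n_agents) matrix of Pearson correlations.

    trajectories: array-like of shape (n_agents, T); each row is one agent's
    history flattened into a numeric vector (a hypothetical encoding).
    """
    x = np.asarray(trajectories, dtype=float)
    # Center each trajectory around its own mean.
    centered = x - x.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1)
    # Constant trajectories have zero variance; dividing by 1 keeps their
    # rows at zero, so their correlation with every agent is 0 (assumption).
    safe_norms = np.where(norms > 0, norms, 1.0)
    unit = centered / safe_norms[:, None]
    # The dot product of unit-norm centered vectors is the Pearson coefficient.
    weights = np.clip(unit @ unit.T, -1.0, 1.0)
    # Drop self-loops; whether the graph keeps them is an assumption here.
    np.fill_diagonal(weights, 0.0)
    return weights


# Three toy agents: the first two move together, the third moves oppositely.
example = pearson_edge_weights([[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]])
```

In this toy case the first two agents receive a weight close to 1 and the third a weight close to -1 against both. How negative correlations are treated (kept, clipped to zero, or passed through a softmax) is a design choice the abstract does not specify.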


Pages: 12
