Efficient Multi-agent Communication via Self-supervised Information Aggregation
Cited by: 0 | Authors:
Guan, Cong [1]
Chen, Feng [1]
Yuan, Lei [1,2]
Wang, Chenghe [1]
Yin, Hao [1]
Zhang, Zongzhang [1]
Yu, Yang [1,2]
Affiliations:
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Polixir Technol, Nanjing, Peoples R China
Source:
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022) | 2022
Funding:
US National Science Foundation;
Keywords:
DOI: none
CLC classification:
TP18 [Artificial Intelligence Theory];
Subject classification codes:
081104; 0812; 0835; 1405
Abstract:
Utilizing messages from teammates can improve coordination in cooperative Multi-agent Reinforcement Learning (MARL). To obtain meaningful information for decision-making, previous works typically combine raw messages from teammates with local information as policy inputs. However, neglecting to aggregate the multiple received messages makes policy learning inefficient. Motivated by recent advances in representation learning, we argue that efficient message aggregation is essential for good coordination in MARL. In this paper, we propose Multi-Agent communication via Self-supervised Information Aggregation (MASIA), with which agents aggregate received messages into compact, highly relevant representations that augment the local policy. Specifically, we design a permutation-invariant message encoder that produces an aggregated representation of common information from raw messages, and we optimize it in a self-supervised manner by reconstructing current information and predicting future information. Each agent then extracts the parts of the aggregated representation most relevant to its own decision-making via a novel message extraction mechanism. Empirical results demonstrate that our method significantly outperforms strong baselines on multiple cooperative MARL tasks across various task settings.
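The key property of the message encoder is permutation invariance: the aggregated representation must not depend on the order in which teammates' messages arrive. A minimal Deep Sets-style sketch illustrates the idea; the dimensions, the shared embedding layer, and the use of mean pooling are illustrative assumptions, not the actual MASIA architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the paper does not specify these values.
MSG_DIM, HID_DIM = 4, 8
W = rng.normal(size=(MSG_DIM, HID_DIM))  # embedding weights shared by all messages


def encode(messages):
    """Permutation-invariant aggregation: embed each teammate's message
    with the same weights, then mean-pool across teammates, so the
    result is identical for any ordering of the messages."""
    h = np.tanh(messages @ W)  # (n_agents, HID_DIM) per-message embeddings
    return h.mean(axis=0)      # (HID_DIM,) aggregated representation


msgs = rng.normal(size=(3, MSG_DIM))  # raw messages from 3 teammates
perm = msgs[[2, 0, 1]]                # same messages, shuffled order
assert np.allclose(encode(msgs), encode(perm))  # order does not matter
```

Any symmetric pooling (sum, max, attention-weighted mean) preserves this invariance; the shared per-message embedding is what lets the pooled vector scale to a variable number of teammates.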