Efficient Communication via Self-Supervised Information Aggregation for Online and Offline Multiagent Reinforcement Learning

Cited by: 0
Authors
Guan, Cong [1,2]
Chen, Feng [1,2]
Yuan, Lei [3]
Zhang, Zongzhang [1,2]
Yu, Yang [3]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[3] Polixir Technol, Nanjing 211106, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Benchmark testing; Reinforcement learning; Observability; Training; Learning (artificial intelligence); Decision making; Data mining; Cooperative multiagent reinforcement learning (MARL); multiagent communication; offline learning; representation learning;
DOI
10.1109/TNNLS.2024.3420791
Chinese Library Classification (CLC) Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Utilizing messages from teammates can improve coordination in cooperative multiagent reinforcement learning (MARL). Previous works typically combine the raw messages of teammates with local information as inputs to the policy. However, neglecting message aggregation leads to significant inefficiency in policy learning. Motivated by recent advances in representation learning, we argue that efficient message aggregation is essential for good coordination in cooperative MARL. In this article, we propose Multiagent communication via Self-supervised Information Aggregation (MASIA), in which agents aggregate received messages into compact, highly relevant representations that augment the local policy. Specifically, we design a permutation-invariant message encoder that generates a common information-aggregated representation from the messages, and we optimize it in a self-supervised manner by reconstructing and shooting (predicting) future information. Each agent then utilizes the most relevant parts of the aggregated representation for decision-making through a novel message extraction mechanism. Furthermore, considering the potential of offline learning for real-world applications, we build offline benchmarks for multiagent communication, which, to our knowledge, are the first of their kind. Empirical results demonstrate the superiority of our method in both online and offline settings. We also release the offline benchmarks built in this article as a testbed for validating communication ability, to facilitate future research in this direction.
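To make the core idea concrete, below is a minimal, illustrative sketch (not the authors' released code) of a permutation-invariant message encoder whose pooled output is trained with a self-supervised reconstruction loss, in the spirit of the abstract. All names and choices here (AggregatedMessageEncoder, msg_dim, latent_dim, Deep Sets-style mean pooling, reconstructing the mean of the raw messages) are hypothetical simplifications; the paper's actual architecture, objectives (including future-information shooting), and message extraction mechanism are more elaborate.

```python
# Illustrative sketch only: permutation-invariant message aggregation with a
# self-supervised reconstruction head. Assumes PyTorch; all names are hypothetical.
import torch
import torch.nn as nn

class AggregatedMessageEncoder(nn.Module):
    def __init__(self, msg_dim: int, hidden_dim: int, latent_dim: int):
        super().__init__()
        # Per-message embedding, applied identically to every teammate's message.
        self.phi = nn.Sequential(
            nn.Linear(msg_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )
        # Post-pooling projection to the compact aggregated representation.
        self.rho = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )
        # Decoder used only for the self-supervised reconstruction objective.
        self.decoder = nn.Linear(latent_dim, msg_dim)

    def forward(self, messages: torch.Tensor) -> torch.Tensor:
        # messages: (batch, n_agents, msg_dim). Mean pooling over the agent axis
        # makes the encoding invariant to the order in which messages arrive.
        z = self.phi(messages).mean(dim=1)
        return self.rho(z)

    def reconstruction_loss(self, messages: torch.Tensor) -> torch.Tensor:
        # Encourage the aggregate to retain message content by reconstructing
        # a simple target (here, the mean of the raw messages) from it.
        z = self.forward(messages)
        target = messages.mean(dim=1)
        return nn.functional.mse_loss(self.decoder(z), target)

# Usage: a batch of 8 samples, 4 teammates, 16-dimensional messages.
enc = AggregatedMessageEncoder(msg_dim=16, hidden_dim=64, latent_dim=32)
msgs = torch.randn(8, 4, 16)
loss = enc.reconstruction_loss(msgs)
loss.backward()
```

The mean pooling is what provides permutation invariance: any reordering of the teammate axis yields the same aggregated representation, so the policy's input does not depend on an arbitrary ordering of received messages.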
Pages: 13