Efficient Communication via Self-Supervised Information Aggregation for Online and Offline Multiagent Reinforcement Learning

Cited by: 0

Authors:

Guan, Cong [1,2]

Chen, Feng [1,2]

Yuan, Lei [3]

Zhang, Zongzhang [1,2]

Yu, Yang [3]

Affiliations:

[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China

[2] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China

[3] Polixir Technol, Nanjing 211106, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2024

Funding:

National Science Foundation (USA);

Keywords:

Benchmark testing; Reinforcement learning; Observability; Training; Learning (artificial intelligence); Decision making; Data mining; Cooperative multiagent reinforcement learning (MARL); multiagent communication; offline learning; representation learning;

DOI:

10.1109/TNNLS.2024.3420791

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Utilizing messages from teammates can improve coordination in cooperative multiagent reinforcement learning (MARL). Previous works typically combine the raw messages of teammates with local information as inputs to the policy. However, neglecting message aggregation makes policy learning significantly less efficient. Motivated by recent advances in representation learning, we argue that efficient message aggregation is essential for good coordination in cooperative MARL. In this article, we propose Multiagent communication via Self-supervised Information Aggregation (MASIA), in which agents aggregate the received messages into compact, highly relevant representations that augment the local policy. Specifically, we design a permutation-invariant message encoder that generates a common information-aggregated representation from messages, and we optimize it in a self-supervised manner by reconstructing and shooting future information. Each agent then uses the most relevant parts of the aggregated representation for decision-making through a novel message extraction mechanism. Furthermore, considering the potential of offline learning for real-world applications, we build offline benchmarks for multiagent communication, which are, to our knowledge, the first of their kind. Empirical results demonstrate the superiority of our method in both online and offline settings. We also release the offline benchmarks built in this article as a testbed for validating communication ability, to facilitate further research in this direction.
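The permutation-invariant aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' MASIA encoder: it assumes a shared tanh embedding applied to every message, followed by mean pooling, and the names `make_weights`, `embed`, `aggregate`, and `is_permutation_invariant` are hypothetical. The self-supervised objectives and the message extraction mechanism are not shown.

```python
import math
import random
from itertools import permutations


def make_weights(in_dim, out_dim, seed=0):
    """Random weight matrix (out_dim x in_dim); stands in for learned parameters."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def embed(message, weights):
    """Shared per-message embedding: tanh(W @ message)."""
    return [math.tanh(sum(w * x for w, x in zip(row, message))) for row in weights]


def aggregate(messages, weights):
    """Embed each message with the same weights, then mean-pool over messages.

    Mean pooling ignores the order of its inputs, so the result does not
    depend on which teammate's message arrives first.
    """
    if not messages:
        raise ValueError("at least one message is required")
    embedded = [embed(m, weights) for m in messages]
    return [sum(column) / len(embedded) for column in zip(*embedded)]


def is_permutation_invariant(messages, weights, tol=1e-12):
    """Check that every ordering of the messages gives the same representation."""
    base = aggregate(messages, weights)
    for perm in permutations(messages):
        out = aggregate(list(perm), weights)
        if not all(math.isclose(a, b, abs_tol=tol) for a, b in zip(base, out)):
            return False
    return True


if __name__ == "__main__":
    # Three teammates, each sending a 3-dimensional message (made-up values).
    msgs = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0], [3.0, 1.0, -2.0]]
    w = make_weights(in_dim=3, out_dim=4)
    print(aggregate(msgs, w))
    print(is_permutation_invariant(msgs, w))
```

Mean pooling is only one order-independent choice; sum pooling or attention-based pooling would satisfy the same property.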


Pages: 13
