Efficient Communication via Self-Supervised Information Aggregation for Online and Offline Multiagent Reinforcement Learning

Cited by: 0
Authors
Guan, Cong [1, 2]
Chen, Feng [1, 2]
Yuan, Lei [3]
Zhang, Zongzhang [1, 2]
Yu, Yang [3]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[3] Polixir Technol, Nanjing 211106, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Benchmark testing; Reinforcement learning; Observability; Training; Learning (artificial intelligence); Decision making; Data mining; Cooperative multiagent reinforcement learning (MARL); multiagent communication; offline learning; representation learning;
DOI
10.1109/TNNLS.2024.3420791
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Utilizing messages from teammates can improve coordination in cooperative multiagent reinforcement learning (MARL). Previous works typically combine the raw messages of teammates with local information as policy inputs. However, neglecting message aggregation makes policy learning significantly inefficient. Motivated by recent advances in representation learning, we argue that efficient message aggregation is essential for good coordination in cooperative MARL. In this article, we propose Multiagent communication via Self-supervised Information Aggregation (MASIA), in which agents aggregate the received messages into compact, highly relevant representations that augment the local policy. Specifically, we design a permutation-invariant message encoder that generates a common information-aggregated representation from messages, and we optimize it in a self-supervised manner by reconstructing and shooting future information. Each agent then uses the most relevant parts of the aggregated representation for decision-making through a novel message extraction mechanism. Furthermore, considering the potential of offline learning for real-world applications, we build offline benchmarks for multiagent communication, which are, to our knowledge, the first of their kind. Empirical results demonstrate the superiority of our method in both online and offline settings. We also release these offline benchmarks as a testbed for validating communication ability, to facilitate future research in this direction.
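To illustrate what "permutation-invariant" aggregation means in the abstract, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the encoder architecture, so mean pooling over a shared linear layer stands in for it here, and a sigmoid gate conditioned on the local observation stands in for the message extraction mechanism. All function names, weight matrices, and dimensions are hypothetical.

```python
import numpy as np

def aggregate_messages(messages, w_enc):
    """Encode each message with shared weights, then mean-pool over senders.

    Mean pooling is an assumed stand-in: it makes the output independent
    of the order in which teammates' messages arrive.
    """
    hidden = np.tanh(messages @ w_enc)   # shape: (n_agents, d_hidden)
    return hidden.mean(axis=0)           # shape: (d_hidden,)

def extract_for_agent(aggregated, local_obs, w_gate):
    """Assumed extraction step: gate the shared representation
    with a sigmoid computed from the agent's local observation."""
    gate = 1.0 / (1.0 + np.exp(-(local_obs @ w_gate)))  # shape: (d_hidden,)
    return gate * aggregated

rng = np.random.default_rng(0)
n_agents, d_msg, d_hidden, d_obs = 4, 6, 8, 5   # hypothetical sizes
messages = rng.normal(size=(n_agents, d_msg))
w_enc = rng.normal(size=(d_msg, d_hidden))
w_gate = rng.normal(size=(d_obs, d_hidden))
local_obs = rng.normal(size=d_obs)

# Shuffling the senders leaves the aggregated representation unchanged.
z = aggregate_messages(messages, w_enc)
z_shuffled = aggregate_messages(messages[rng.permutation(n_agents)], w_enc)
assert np.allclose(z, z_shuffled)

# The policy would see the local observation plus the extracted part.
policy_input = np.concatenate([local_obs, extract_for_agent(z, local_obs, w_gate)])
```

In a trained system the weights would be learned, with the encoder optimized by the self-supervised objectives the abstract mentions; this sketch only demonstrates the order-independence property and how the extracted representation could be concatenated with local information.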
Pages: 13