Efficient Communication via Self-Supervised Information Aggregation for Online and Offline Multiagent Reinforcement Learning

Cited by: 0
Authors
Guan, Cong [1,2]
Chen, Feng [1,2]
Yuan, Lei [3]
Zhang, Zongzhang [1,2]
Yu, Yang [3]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[3] Polixir Technol, Nanjing 211106, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Benchmark testing; Reinforcement learning; Observability; Training; Learning (artificial intelligence); Decision making; Data mining; Cooperative multiagent reinforcement learning (MARL); multiagent communication; offline learning; representation learning;
DOI
10.1109/TNNLS.2024.3420791
Chinese Library Classification (CLC) Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Utilizing messages from teammates can improve coordination in cooperative multiagent reinforcement learning (MARL). Previous works typically combine the raw messages of teammates with local information as inputs to the policy. However, neglecting message aggregation leads to significant inefficiency in policy learning. Motivated by recent advances in representation learning, we argue that efficient message aggregation is essential for good coordination in cooperative MARL. In this article, we propose Multiagent communication via Self-supervised Information Aggregation (MASIA), in which agents aggregate received messages into compact, highly relevant representations that augment the local policy. Specifically, we design a permutation-invariant message encoder that generates a common information-aggregated representation from the messages, and we optimize it in a self-supervised manner by reconstructing and shooting (predicting) future information. Each agent then utilizes the most relevant parts of the aggregated representation for decision-making through a novel message extraction mechanism. Furthermore, considering the potential of offline learning for real-world applications, we build offline benchmarks for multiagent communication, which, to our knowledge, are the first of their kind. Empirical results demonstrate the superiority of our method in both online and offline settings. We also release the offline benchmarks built in this article as a testbed for validating communication ability, to facilitate future research in this direction.
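To make the core idea concrete, below is a minimal, illustrative sketch (not the authors' released code) of a permutation-invariant message encoder whose pooled output is trained with a self-supervised reconstruction loss, in the spirit of the abstract. All names and choices here (AggregatedMessageEncoder, msg_dim, latent_dim, Deep Sets-style mean pooling, reconstructing the mean of the raw messages) are hypothetical simplifications; the paper's actual architecture, objectives (including future-information shooting), and message extraction mechanism are more elaborate.

```python
# Illustrative sketch only: permutation-invariant message aggregation with a
# self-supervised reconstruction head. Assumes PyTorch; all names are hypothetical.
import torch
import torch.nn as nn

class AggregatedMessageEncoder(nn.Module):
    def __init__(self, msg_dim: int, hidden_dim: int, latent_dim: int):
        super().__init__()
        # Per-message embedding, applied identically to every teammate's message.
        self.phi = nn.Sequential(
            nn.Linear(msg_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )
        # Post-pooling projection to the compact aggregated representation.
        self.rho = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )
        # Decoder used only for the self-supervised reconstruction objective.
        self.decoder = nn.Linear(latent_dim, msg_dim)

    def forward(self, messages: torch.Tensor) -> torch.Tensor:
        # messages: (batch, n_agents, msg_dim). Mean pooling over the agent axis
        # makes the encoding invariant to the order in which messages arrive.
        z = self.phi(messages).mean(dim=1)
        return self.rho(z)

    def reconstruction_loss(self, messages: torch.Tensor) -> torch.Tensor:
        # Encourage the aggregate to retain message content by reconstructing
        # a simple target (here, the mean of the raw messages) from it.
        z = self.forward(messages)
        target = messages.mean(dim=1)
        return nn.functional.mse_loss(self.decoder(z), target)

# Usage: a batch of 8 samples, 4 teammates, 16-dimensional messages.
enc = AggregatedMessageEncoder(msg_dim=16, hidden_dim=64, latent_dim=32)
msgs = torch.randn(8, 4, 16)
loss = enc.reconstruction_loss(msgs)
loss.backward()
```

The mean pooling is what provides permutation invariance: any reordering of the teammate axis yields the same aggregated representation, so the policy's input does not depend on an arbitrary ordering of received messages.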
Pages: 13