Event-Triggered Communication Network With Limited-Bandwidth Constraint for Multi-Agent Reinforcement Learning

Cited by: 31
Authors
Hu, Guangzheng [1 ,2 ]
Zhu, Yuanheng [1 ,2 ]
Zhao, Dongbin [1 ,2 ]
Zhao, Mengchen [3 ]
Hao, Jianye [3 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[2] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[3] Huawei, Noah's Ark Lab, Beijing 100085, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Bandwidth; Protocols; Reinforcement learning; Task analysis; Optimization; Communication networks; Multi-agent systems; Event trigger; limited bandwidth; multi-agent communication; multi-agent reinforcement learning (MARL); IMPROVING COORDINATION;
DOI
10.1109/TNNLS.2021.3121546
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Communicating with each other in a distributed manner and behaving as a group are essential in multi-agent reinforcement learning. However, real-world multi-agent systems are restricted by limited-bandwidth communication. If the bandwidth is fully occupied, some agents cannot send messages to others promptly, causing decision delays and impairing cooperation. Recent work has begun to address this problem but still falls short of maximally reducing the consumption of communication resources. In this article, we propose an event-triggered communication network (ETCNet) to enhance communication efficiency in multi-agent systems by communicating only when necessary. For different task requirements, two paradigms of the ETCNet framework, the event-triggered sending network (ETSNet) and the event-triggered receiving network (ETRNet), are proposed for learning efficient sending and receiving protocols, respectively. Leveraging information theory, the limited bandwidth is translated into the penalty threshold of an event-triggered strategy, which determines whether an agent participates in communication at each step. The design of the event-triggered strategy is then formulated as a constrained Markov decision problem, and reinforcement learning finds a feasible and optimal communication protocol that satisfies the limited-bandwidth constraint. Experiments on typical multi-agent tasks demonstrate that ETCNet outperforms other methods in reducing bandwidth occupancy while largely preserving the cooperative performance of the multi-agent system.
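The core mechanism described in the abstract, transmitting only when an event condition fires, can be illustrated with a minimal sketch. This is not the authors' learned protocol: the norm-based trigger, the threshold value `delta`, the message dimension, and the random-walk message model below are all assumptions chosen for illustration. The idea is that an agent sends a message only when its candidate message deviates from the last transmitted one by more than a threshold, so bandwidth occupancy stays low when messages change slowly.

```python
import numpy as np


def event_trigger(message, last_sent, delta):
    """Fire (return True) only when the candidate message deviates
    from the last transmitted one by more than the threshold delta."""
    return np.linalg.norm(message - last_sent) > delta


rng = np.random.default_rng(0)
steps = 100
msg = np.zeros(4)        # the agent's local message signal
last_sent = np.zeros(4)  # the receiver's copy (last transmitted value)
sent = 0

for _ in range(steps):
    # Illustrative dynamics: the message drifts a little each step.
    msg = msg + rng.normal(scale=0.1, size=4)
    if event_trigger(msg, last_sent, delta=0.5):
        last_sent = msg.copy()  # transmit: receiver's copy is refreshed
        sent += 1

# Bandwidth occupancy is the fraction of steps on which a message
# was actually sent; the trigger keeps it well below 1.
occupancy = sent / steps
```

In ETCNet the trigger is not a fixed norm test but is learned under a constrained Markov decision formulation, with the penalty threshold derived from the bandwidth limit via information theory; the sketch only conveys the event-triggered gating pattern.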
Pages: 3966-3978
Page count: 13