A leader-following paradigm based deep reinforcement learning method for multi-agent cooperation games

Cited by: 10
Authors
Zhang, Feiye [1 ]
Yang, Qingyu [1 ,2 ]
An, Dou [1 ,2 ]
Affiliations
[1] Xi An Jiao Tong Univ, Fac Elect & Informat Engn, 28, West Xianning Rd, Xian 710049, Shaanxi, Peoples R China
[2] Xi An Jiao Tong Univ, MOE Key Lab Intelligent Networks & Network Secur, 28, West Xianning Rd, Xian 710049, Shaanxi, Peoples R China
Funding
National Science Foundation (USA); National Natural Science Foundation of China
Keywords
Multi-agent systems; Deep reinforcement learning; Centralized training with decentralized execution; Cooperative games; LEVEL
DOI
10.1016/j.neunet.2022.09.012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-agent deep reinforcement learning algorithms that follow the centralized training with decentralized execution (CTDE) paradigm have attracted growing attention in both industry and the research community. However, existing CTDE methods follow an action selection paradigm in which all agents choose actions at the same time, which ignores the heterogeneous roles of different agents. Motivated by human wisdom in cooperative behaviors, we present a novel leader-following paradigm based deep multi-agent cooperation method (LFMCO) for multi-agent cooperative games. Specifically, we define a leader as an agent that broadcasts a message representing its selected action to all subordinates. The followers then choose their individual actions based on the message received from the leader. To measure the influence of the leader's action on the followers, we introduce the concept of information gain, i.e., the change in the entropy of the followers' value function, which is positively correlated with the influence of the leader's action. We evaluate LFMCO on several cooperation scenarios of StarCraft2. Simulation results confirm significant performance improvements of LFMCO over four state-of-the-art benchmarks in this challenging cooperative environment. (c) 2022 Elsevier Ltd. All rights reserved.
Pages: 1 - 12
Page count: 12
Related papers
50 items in total
  • [31] Leader-following consensus for linear multi-agent systems with measurement noises
    Chen, Kairui
    Yan, Chuance
    Zhu, Zhangmou
    Li, Ping
    Ren, Qijun
    Wang, Junwei
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 4512 - 4516
  • [32] Leader-following consensus protocols for formation control of multi-agent network
    Luo, Xiaoyuan
    Han, Nani
    Guan, Xinping
    JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2011, 22 (06) : 991 - 997
  • [33] A leader-following consensus problem of multi-agent systems in heterogeneous networks
    Cruz-Ancona, Christopher D.
    Martinez-Guerra, Rafael
    Perez-Pinacho, Claudia A.
    AUTOMATICA, 2020, 115
  • [34] Topology Switching for Optimal Leader-Following Consensus of Multi-Agent Systems
    Fan, Tianpeng
    Wan, Quan
    Ding, Zhengtao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (08) : 3845 - 3849
  • [35] A leader-following rendezvous problem of double integrator multi-agent systems
    Dong, Yi
    Huang, Jie
    AUTOMATICA, 2013, 49 (05) : 1386 - 1391
  • [36] Consensus of a leader-following multi-agent system with negative weights and noises
    Hu, Ai-Hua
    Cao, Jin-De
    Hu, Man-Feng
    Guo, Liu-Xiao
    IET CONTROL THEORY AND APPLICATIONS, 2014, 8 (02) : 114 - 119
  • [37] Iterative Learning Control for Leader-following Consensus of Nonlinear Multi-agent Systems with Packet Dropout
    Deng, Xiongfeng
    Sun, Xiuxia
    Liu, Shuguang
    INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2019, 17 (08) : 2135 - 2144
  • [38] Iterative Learning Control for Leader-following Consensus of Nonlinear Multi-agent Systems with Packet Dropout
    Xiongfeng Deng
    Xiuxia Sun
    Shuguang Liu
    International Journal of Control, Automation and Systems, 2019, 17 : 2135 - 2144
  • [39] Leader-Following Rendezvous with Connectivity Preservation of a Class of Multi-agent Systems
    Dong, Yi
    Huang, Jie
    PROCEEDINGS OF THE 31ST CHINESE CONTROL CONFERENCE, 2012, : 6477 - 6482
  • [40] Leader-following consensus of multi-agent systems with delayed impulsive control
    Liu, Jia
    Guo, Liuxiao
    Hu, Manfeng
    Xu, Zhenyuan
    Yang, Yongqing
    IMA JOURNAL OF MATHEMATICAL CONTROL AND INFORMATION, 2016, 33 (01) : 137 - 146