A Deep Reinforcement Learning Method based on Deterministic Policy Gradient for Multi-Agent Cooperative Competition

Times cited: 0
Authors
Zuo, Xuan [1]
Xue, Hui-Feng [2]
Wang, Xiao-Yin [2]
Du, Wan-Ru [2]
Tian, Tao [2]
Gao, Shan [1]
Zhang, Pu [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Xian 710072, Peoples R China
[2] China Aerosp Acad Syst Sci & Engn, Beijing 100048, Peoples R China
Source
CONTROL ENGINEERING AND APPLIED INFORMATICS | 2021 / Vol. 23 / No. 03
Keywords
Machine learning; reinforcement learning; multi-agent; cooperative competition; artificial intelligence; GO; ALGORITHM; GAME
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Deep reinforcement learning in multi-agent scenarios is important for real-world applications but presents challenges beyond those seen in single-agent settings. This paper proposes a method to train a team of multiple types of agents to cooperate against another team of agents. It also studies how to train multiple types of agents to collaborate better on their team tasks, and analyses the influence of various factors on the agents' policies. In the computer experiments, agents are divided into attacking agents and defending agents. The results show that attacking agents that play the role of deceivers can attract most of the defending agents and help the other attacking agents reach their targets. Choosing an appropriate training length helps agents learn a better action policy. The experimental results also show that the number of agents affects the performance of the proposed method. Increasing the number of deceivers among the attacking agents can significantly increase the attacking team's mission success, but the computational complexity rises and more episodes are needed to train the agents.
Pages: 88-98
Page count: 11