Multi-Agent Autonomous Battle Management Using Deep Neuroevolution

Cited by: 3
Authors
Soleyman, Sean [1 ]
Hung, Fan [1 ]
Khosla, Deepak [1 ]
Chen, Yang [1 ]
Fadaie, Joshua G. [2 ]
Naderi, Navid [1 ]
Affiliations
[1] HRL Labs LLC, 3011 Malibu Canyon Rd, Malibu, CA 90265 USA
[2] Boeing Co, 325 James S McDonnell Blvd, Hazelwood, MO 63042 USA
Source
UNMANNED SYSTEMS TECHNOLOGY XXIII | 2021 / Vol. 11758
Keywords
multi-agent; neuroevolution; autonomous vehicles; aircraft; battle management; reinforcement learning; task allocation; AFSIM; LEVEL;
DOI
10.1117/12.2585530
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Next-generation autonomous vehicles will require a level of team coordination that cannot be achieved using traditional Artificial Intelligence (AI) planning algorithms or Machine Learning (ML) algorithms alone. We present a method for controlling teams of military aircraft in air battle applications by using a novel combination of deep neuroevolution with an allocation-based task assignment algorithm. We describe the neuroevolution techniques that enable a deep neural network to evolve an effective policy, including a novel mutation operator that enhances the stability of the evolution process. We also compare this new method to policy gradient Reinforcement Learning (RL) techniques that we have utilized in previous work, and explain why neuroevolution presents several benefits in this particular application domain. The key analytical result is that neuroevolution makes it easier to select long sequences of actions following a consistent pattern, such as a continuous turning maneuver that occurs frequently in air engagements. We additionally describe multiple ways in which this neuroevolution approach can be integrated with allocation algorithms such as the Kuhn-Munkres Hungarian algorithm. We explain why gradient-free methods are particularly amenable to this hybrid approach and open up exciting new algorithmic possibilities. Since neuroevolution requires thousands of training episodes, we also describe an asynchronous parallelization scheme that yields an order-of-magnitude speedup by evaluating multiple individuals from the evolving population simultaneously. Our deep neuroevolution approach outperforms human-programmed AI opponents with a win rate greater than 80% in multi-agent Beyond Visual Range air engagement simulations developed using AFSIM.
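The Kuhn-Munkres Hungarian algorithm named in the abstract solves the allocation subproblem (which aircraft is assigned to which target). A minimal sketch of that assignment step using SciPy's `linear_sum_assignment`; the cost matrix here is purely illustrative, not the paper's actual cost model:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustrative cost matrix: cost[i, j] = cost of assigning aircraft i
# to target j (in practice derived from range, geometry, or a learned
# value estimate produced by the evolved policy network).
cost = np.array([
    [4.0, 1.0, 3.0],
    [2.0, 0.0, 5.0],
    [3.0, 2.0, 2.0],
])

# Kuhn-Munkres finds the one-to-one assignment minimizing total cost.
rows, cols = linear_sum_assignment(cost)
assignment = list(zip(rows.tolist(), cols.tolist()))
total = float(cost[rows, cols].sum())
print(assignment, total)
```

Because the method is gradient-free, this discrete assignment step can sit directly inside the evaluated policy without needing to be differentiable, which is one reason the abstract argues the hybrid approach is a natural fit.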
Pages: 12
References
20 records in total
  • [1] [Anonymous], 2002, AI TECHNIQUES GAME P
  • [2] [Anonymous], 2006, EVOLUTIONARY DYNAMIC, DOI: 10.2307/j.ctvjghw98
  • [3] Baker B., 2020, Emergent tool use from multi-agent autocurricula
  • [4] Berner C., 2019, arXiv:1912.06680
  • [5] Clive P. D., 2015, INT C SCI COMP
  • [6] Geron A., 2017, HANDS MACHINE LEARNI
  • [7] Jaderberg M., Czarnecki W.M., Dunning I., Marris L., Lever G., Castaneda A.G., Beattie C., Rabinowitz N.C., Morcos A.S., Ruderman A., Sonnerat N., Green T., Deason L., Leibo J.Z., Silver D., Hassabis D., Kavukcuoglu K., Graepel T., Human-level performance in 3D multiplayer games with population-based reinforcement learning, SCIENCE, 2019, 364(6443): 859+
  • [8] Kuhn H.W., The Hungarian Method for the assignment problem, NAVAL RESEARCH LOGISTICS, 2005, 52(1): 7-21
  • [9] Lapan, 2018, DEEP REINFORCEMENT L
  • [10] Matiisen T., 2018, COMPUTATIONAL NEUROS