Multi-Agent Autonomous Battle Management Using Deep Neuroevolution

Cited by: 3
Authors
Soleyman, Sean [1 ]
Hung, Fan [1 ]
Khosla, Deepak [1 ]
Chen, Yang [1 ]
Fadaie, Joshua G. [2 ]
Naderi, Navid [1 ]
Affiliations
[1] HRL Labs LLC, 3011 Malibu Canyon Rd, Malibu, CA 90265 USA
[2] Boeing Co, 325 James S McDonnell Blvd, Hazelwood, MO 63042 USA
Source
UNMANNED SYSTEMS TECHNOLOGY XXIII | 2021 / Vol. 11758
Keywords
multi-agent; neuroevolution; autonomous vehicles; aircraft; battle management; reinforcement learning; task allocation; AFSIM; LEVEL;
DOI
10.1117/12.2585530
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Next-generation autonomous vehicles will require a level of team coordination that cannot be achieved using traditional Artificial Intelligence (AI) planning algorithms or Machine Learning (ML) algorithms alone. We present a method for controlling teams of military aircraft in air battle applications by using a novel combination of deep neuroevolution with an allocation-based task assignment algorithm. We describe the neuroevolution techniques that enable a deep neural network to evolve an effective policy, including a novel mutation operator that enhances the stability of the evolution process. We also compare this new method to policy gradient Reinforcement Learning (RL) techniques that we have utilized in previous work, and explain why neuroevolution presents several benefits in this particular application domain. The key analytical result is that neuroevolution makes it easier to select long sequences of actions following a consistent pattern, such as a continuous turning maneuver that occurs frequently in air engagements. We additionally describe multiple ways in which this neuroevolution approach can be integrated with allocation algorithms such as the Kuhn-Munkres Hungarian algorithm. We explain why gradient-free methods are particularly amenable to this hybrid approach and open up exciting new algorithmic possibilities. Since neuroevolution requires thousands of training episodes, we also describe an asynchronous parallelization scheme that yields an order-of-magnitude speedup by evaluating multiple individuals from the evolving population simultaneously. Our deep neuroevolution approach outperforms human-programmed AI opponents with a win rate greater than 80% in multi-agent Beyond Visual Range air engagement simulations developed using AFSIM.
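The Kuhn-Munkres Hungarian algorithm named in the abstract solves the allocation subproblem (which aircraft is assigned to which target). A minimal sketch of that assignment step using SciPy's `linear_sum_assignment`; the cost matrix here is purely illustrative, not the paper's actual cost model:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustrative cost matrix: cost[i, j] = cost of assigning aircraft i
# to target j (in practice derived from range, geometry, or a learned
# value estimate produced by the evolved policy network).
cost = np.array([
    [4.0, 1.0, 3.0],
    [2.0, 0.0, 5.0],
    [3.0, 2.0, 2.0],
])

# Kuhn-Munkres finds the one-to-one assignment minimizing total cost.
rows, cols = linear_sum_assignment(cost)
assignment = list(zip(rows.tolist(), cols.tolist()))
total = float(cost[rows, cols].sum())
print(assignment, total)
```

Because the method is gradient-free, this discrete assignment step can sit directly inside the evaluated policy without needing to be differentiable, which is one reason the abstract argues the hybrid approach is a natural fit.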
Pages: 12
References
20 records in total
  • [1] [Anonymous], 2002, AI TECHNIQUES GAME P
  • [2] [Anonymous], 2006, EVOLUTIONARY DYNAMIC, DOI: 10.2307/j.ctvjghw98
  • [3] Baker B., 2020, Emergent tool use from multi-agent autocurricula
  • [4] Berner C., 2019, arXiv:1912.06680
  • [5] Clive P. D., 2015, INT C SCI COMP
  • [6] Geron A., 2017, HANDS MACHINE LEARNI
  • [7] Jaderberg M., Czarnecki W.M., Dunning I., Marris L., Lever G., Castaneda A.G., Beattie C., Rabinowitz N.C., Morcos A.S., Ruderman A., Sonnerat N., Green T., Deason L., Leibo J.Z., Silver D., Hassabis D., Kavukcuoglu K., Graepel T., Human-level performance in 3D multiplayer games with population-based reinforcement learning, SCIENCE, 2019, 364(6443): 859+
  • [8] Kuhn H.W., The Hungarian Method for the assignment problem, NAVAL RESEARCH LOGISTICS, 2005, 52(1): 7-21
  • [9] Lapan, 2018, DEEP REINFORCEMENT L
  • [10] Matiisen T., 2018, COMPUTATIONAL NEUROS