Aggregation Transfer Learning for Multi-Agent Reinforcement Learning

Cited by: 6
Authors
Xu, Dongsheng [1 ]
Qiao, Peng [1 ]
Dou, Yong [1 ]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha, Peoples R China
Source
2021 2ND INTERNATIONAL CONFERENCE ON BIG DATA & ARTIFICIAL INTELLIGENCE & SOFTWARE ENGINEERING (ICBASE 2021) | 2021
Keywords
MADDPG; transfer learning; GNN; reinforcement learning; LEVEL;
DOI
10.1109/ICBASE53849.2021.00107
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-agent reinforcement learning is currently applied mainly in real-time strategy settings such as StarCraft and UAV combat, and its algorithms have attracted widespread attention. In large-scale multi-agent environments, the problem of state-space explosion remains. In transfer training especially, because the network input size is fixed, existing network structures are difficult to adapt to large-scale scenario transfer training. In this paper, we use aggregation transfer training for multi-agent combat problems in aerial unmanned aerial vehicle (UAV) combat scenarios, extending small-scale learning to large-scale, complex scenarios. We combine a graph neural network (GNN) with the MADDPG algorithm, processing each agent's observation with an aggregation function and taking the result as network input. Training starts from a small-scale multi-UAV combat scenario and gradually increases the number of UAVs. The experimental results indicate that MADDPG methods for multi-agent UAV combat problems trained via aggregation transfer learning reach the target performance more quickly and provide superior performance compared with ones trained without aggregation transfer learning. The versatility of the confrontation model is also improved.
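The abstract does not specify the exact aggregation function used; a minimal sketch of the underlying idea (a permutation-invariant mean pooling that keeps the policy-network input size fixed as the team grows, with all names hypothetical) might look like:

```python
import numpy as np

def aggregate_observations(own_obs, neighbor_obs):
    """Pool a variable number of neighbor observations into a fixed-size
    vector with a permutation-invariant mean, then concatenate it with
    the agent's own observation. The output length is constant, so the
    same network can be transferred as the number of agents increases."""
    own_obs = np.asarray(own_obs, dtype=float)
    if len(neighbor_obs) == 0:
        pooled = np.zeros_like(own_obs)  # no neighbors: neutral summary
    else:
        pooled = np.mean(np.stack(neighbor_obs), axis=0)
    return np.concatenate([own_obs, pooled])

# The input size stays fixed whether the team has 3 or 10 neighbors.
own = np.ones(4)
small_team = [np.random.randn(4) for _ in range(3)]
large_team = [np.random.randn(4) for _ in range(10)]
assert aggregate_observations(own, small_team).shape == (8,)
assert aggregate_observations(own, large_team).shape == (8,)
```

Because mean pooling is invariant to both the order and the number of neighbors, a policy trained on a small scenario can receive inputs of the same shape in a larger one, which is what makes the transfer step possible.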
Pages: 547-551
Page count: 5