Aggregation Transfer Learning for Multi-Agent Reinforcement Learning

Cited by: 6
Authors
Xu, Dongsheng [1 ]
Qiao, Peng [1 ]
Dou, Yong [1 ]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha, Peoples R China
Source
2021 2ND INTERNATIONAL CONFERENCE ON BIG DATA & ARTIFICIAL INTELLIGENCE & SOFTWARE ENGINEERING (ICBASE 2021) | 2021
Keywords
MADDPG; transfer learning; GNN; reinforcement learning; LEVEL;
DOI
10.1109/ICBASE53849.2021.00107
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-agent reinforcement learning is currently applied mainly in real-time strategy settings such as StarCraft and UAV combat, and its algorithms have attracted widespread attention. In large-scale multi-agent environments, the problem of state-space explosion remains. In transfer training especially, because the network input size is fixed, existing network structures are difficult to adapt to large-scale scenario transfer training. In this paper, we use aggregation transfer training for multi-agent combat problems in aerial unmanned aerial vehicle (UAV) combat scenarios, extending small-scale learning to large-scale, complex scenarios. We combine a graph neural network (GNN) with the MADDPG algorithm, processing each agent's observation with an aggregation function and taking the result as network input. Training starts from a small-scale multi-UAV combat scenario and gradually increases the number of UAVs. The experimental results indicate that MADDPG methods for multi-agent UAV combat problems trained via aggregation transfer learning reach the target performance more quickly and provide superior performance compared with ones trained without aggregation transfer learning. The versatility of the confrontation model is also improved.
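The abstract does not specify the exact aggregation function used; a minimal sketch of the underlying idea (a permutation-invariant mean pooling that keeps the policy-network input size fixed as the team grows, with all names hypothetical) might look like:

```python
import numpy as np

def aggregate_observations(own_obs, neighbor_obs):
    """Pool a variable number of neighbor observations into a fixed-size
    vector with a permutation-invariant mean, then concatenate it with
    the agent's own observation. The output length is constant, so the
    same network can be transferred as the number of agents increases."""
    own_obs = np.asarray(own_obs, dtype=float)
    if len(neighbor_obs) == 0:
        pooled = np.zeros_like(own_obs)  # no neighbors: neutral summary
    else:
        pooled = np.mean(np.stack(neighbor_obs), axis=0)
    return np.concatenate([own_obs, pooled])

# The input size stays fixed whether the team has 3 or 10 neighbors.
own = np.ones(4)
small_team = [np.random.randn(4) for _ in range(3)]
large_team = [np.random.randn(4) for _ in range(10)]
assert aggregate_observations(own, small_team).shape == (8,)
assert aggregate_observations(own, large_team).shape == (8,)
```

Because mean pooling is invariant to both the order and the number of neighbors, a policy trained on a small scenario can receive inputs of the same shape in a larger one, which is what makes the transfer step possible.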
Pages: 547-551
Page count: 5