Learning Evasion Strategy in Pursuit-Evasion by Deep Q-network

Times cited: 0
Authors
Zhu, Jiagang [1, 2]
Zou, Wei [1, 3]
Zhu, Zheng [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] TianJin Intelligent Tech Inst CASIA Co Ltd, Tianjin, Peoples R China
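Source
2018 24th International Conference on Pattern Recognition (ICPR) | 2018
Funding
National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program);
Keywords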
GAME; GO;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents an approach that uses a Deep Q-network (DQN) to learn an evasion strategy for the evader in pursuit-evasion against the pursuers. To give the agent an immediate reward, we handcraft a reward function that accounts for both the evader escaping encirclement by the pursuers and keeping its distance from them. The approach thus combines the artificial potential field method with deep reinforcement learning. The learned evasion strategy is verified by a series of experiments in three different game scenarios. The training stability and the value function are each analyzed. The three learned agents are compared with a random agent and a repulsive agent, and the results show the effectiveness of our method.
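The abstract does not give the reward formula, so the following is only a minimal illustrative sketch of how a handcrafted reward might combine the two ideas it names: a potential-field-style distance term and a term that penalizes being surrounded. The function name `evasion_reward`, the weights, the capture radius, the capture penalty, and the use of the largest angular gap between pursuers as an encirclement measure are all assumptions made for illustration, not the authors' definitions.

```python
import math

def evasion_reward(evader, pursuers, w_dist=1.0, w_gap=1.0,
                   capture_radius=0.5, capture_penalty=-10.0):
    """Hypothetical handcrafted reward for a 2-D evader (not the paper's formula).

    evader   -- (x, y) position of the evader
    pursuers -- non-empty list of (x, y) pursuer positions
    """
    ex, ey = evader
    dists = [math.hypot(px - ex, py - ey) for px, py in pursuers]
    nearest = min(dists)

    # Assumed terminal case: the evader is caught by the nearest pursuer.
    if nearest <= capture_radius:
        return capture_penalty

    # Distance term, in the spirit of a repulsive potential:
    # less negative as the nearest pursuer gets farther away.
    dist_term = -1.0 / nearest

    # Encirclement term: bearings of the pursuers as seen from the evader,
    # then the widest angular gap between neighbouring bearings.
    # A wide gap means an open escape direction; a narrow one means surrounded.
    angles = sorted(math.atan2(py - ey, px - ex) for px, py in pursuers)
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))  # wrap-around gap
    gap_term = max(gaps) / (2 * math.pi)  # normalized to (0, 1]

    return w_dist * dist_term + w_gap * gap_term


# Usage: the same nearest distance, but different degrees of encirclement.
surrounded = [(2, 0), (0, 2), (-2, 0), (0, -2)]
one_sided = [(2, 0), (2, 1), (2, -1)]
r_surrounded = evasion_reward((0, 0), surrounded)
r_one_sided = evasion_reward((0, 0), one_sided)
```

Under these assumptions, `r_surrounded` comes out lower than `r_one_sided`, because the nearest pursuer is equally far in both cases while the widest escape gap is much narrower when the pursuers are spread around the evader. In a DQN setup, a value like this would serve as the per-step reward.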
Pages: 67-72
Page count: 6