StarCraft Micromanagement With Reinforcement Learning and Curriculum Transfer Learning

Cited by: 124
Authors
Shao, Kun [1 ,2 ]
Zhu, Yuanheng [1 ,2 ]
Zhao, Dongbin [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 101408, Peoples R China
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2019, Vol. 3, No. 01
Funding
National Natural Science Foundation of China;
Keywords
Reinforcement learning; transfer learning; curriculum learning; neural network; game AI; GAME; GO;
DOI
10.1109/TETCI.2018.2823329
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Real-time strategy games have been an important field of game artificial intelligence in recent years. This paper presents a reinforcement learning and curriculum transfer learning method to control multiple units in StarCraft micromanagement. We define an efficient state representation, which breaks down the complexity caused by the large state space in the game environment. Then, a parameter-sharing multi-agent gradient-descent Sarsa(λ) algorithm is proposed to train the units. The learning policy is shared among our units to encourage cooperative behaviors. We use a neural network as a function approximator to estimate the action-value function, and propose a reward function to help units balance their move and attack. In addition, a transfer learning method is used to extend our model to more difficult scenarios, which accelerates the training process and improves the learning performance. In small-scale scenarios, our units successfully learn to combat and defeat the built-in AI with 100% win rates. In large-scale scenarios, the curriculum transfer learning method is used to progressively train a group of units, and it shows superior performance over some baseline methods in target scenarios. With reinforcement learning and curriculum transfer learning, our units are able to learn appropriate strategies in StarCraft micromanagement scenarios.
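The abstract's core training scheme is parameter-sharing multi-agent gradient-descent Sarsa(λ): every unit acts with, and updates, a single shared action-value function while keeping its own eligibility trace. Below is a minimal sketch of that scheme, not the paper's implementation: a linear Q-approximator stands in for the paper's neural network, the feature size, action count, hyperparameters, and the `env_step` environment hook are all hypothetical placeholders.

```python
import numpy as np

# Sketch of parameter-sharing multi-agent gradient-descent Sarsa(lambda).
# Assumptions (not from the paper): linear Q instead of a neural network,
# placeholder sizes/hyperparameters, and a hypothetical `env_step` hook.

N_FEATURES, N_ACTIONS = 32, 9          # e.g., 8 move directions + attack
ALPHA, GAMMA, LAMBDA_, EPSILON = 0.01, 0.95, 0.8, 0.1

theta = np.zeros((N_ACTIONS, N_FEATURES))  # one parameter set shared by all units

def q_values(state):
    """Q(s, .) for every action under the shared parameters."""
    return theta @ state

def epsilon_greedy(state, rng):
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(state)))

def run_episode(env_step, init_states, rng, max_steps=200):
    """One episode: each unit keeps its own eligibility trace but
    reads from and writes to the single shared Q-function."""
    global theta
    n_units = len(init_states)
    traces = [np.zeros_like(theta) for _ in range(n_units)]  # per-unit e(s, a)
    states = list(init_states)
    actions = [epsilon_greedy(s, rng) for s in states]
    for _ in range(max_steps):
        for i in range(n_units):
            next_state, reward, done = env_step(i, states[i], actions[i])
            next_action = epsilon_greedy(next_state, rng)
            # on-policy TD target and error for unit i
            target = reward + (0.0 if done else GAMMA * q_values(next_state)[next_action])
            delta = target - q_values(states[i])[actions[i]]
            # decay the trace, then accumulate the gradient of Q(s, a) w.r.t. theta,
            # which for the linear case is the state vector in row a
            traces[i] *= GAMMA * LAMBDA_
            traces[i][actions[i]] += states[i]
            theta = theta + ALPHA * delta * traces[i]  # shared gradient update
            states[i], actions[i] = next_state, next_action
            if done:
                return
```

Calling `run_episode` with `rng = np.random.default_rng(0)` and any environment function of the assumed `(unit_id, state, action) -> (next_state, reward, done)` shape runs one training episode; because all units write into the same `theta`, experience from any one unit improves the policy of all of them, which is the cooperative effect the abstract describes.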
Pages: 73-84
Page count: 12