Learning Transferable Policies with Improved Graph Neural Networks on Serial Robotic Structure

Cited by: 0
Authors
Zhang, Fengyi [1, 2]
Xiong, Fangzhou [1, 2]
Yang, Xu [1, 3]
Liu, Zhiyong [1, 2, 4]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci UCAS, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Huizhou Adv Mfg Technol Res Ctr Co Ltd, Huizhou 516000, Peoples R China
[4] Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Shanghai 200031, Peoples R China
Source
NEURAL INFORMATION PROCESSING (ICONIP 2019), PT III | 2019 / Vol. 11955
Keywords
Graph neural networks; Transferable policy; Serial robotic structure;
DOI
10.1007/978-3-030-36718-3_10
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Robotic control via reinforcement learning (RL) has made significant advances. However, a serious weakness of this approach is that RL models are prone to overfitting and transfer poorly. Transfer in reinforcement learning means that only a few samples are needed to train policy networks for new tasks. In this paper we investigate the problem of learning transferable policies for robots with serial structures, such as robotic arms, with the help of graph neural networks (GNNs). GNNs were previously employed to incorporate the robot structure explicitly into the policy network, making the policy easier to generalize or transfer. Based on a kinematic analysis of the serial robotic structure, we further improve the policy network by proposing a weighted information aggregation strategy. The experiments are conducted in a few-shot policy learning setting on a robotic arm. The results show that the new aggregation strategy significantly improves both the learning speed and the policy accuracy.
Pages: 115-126
Number of pages: 12