Effective Policy Adjustment via Meta-Learning for Complex Manipulation Tasks

Cited by: 0
Authors
Wu, Binghong [1, 2]
Hao, Kuangrong [1, 2]
Cai, Xin [1, 2]
Tang, Xuesong [1, 2]
Wang, Tong [1, 2]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai 201620, Peoples R China
[2] Donghua Univ, Engn Res Ctr Digitized Text & Fash Technol, Minist Educ, Shanghai 201620, Peoples R China
Source
2018 CHINESE AUTOMATION CONGRESS (CAC) | 2018
Keywords
policy gradient reinforcement learning; meta-learning; effective policy adjustment; complex robot manipulation tasks;
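DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract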
The ability to adjust a policy is key to learning decision making when agents complete complex manipulation tasks. To address this problem while accounting for both exploration and exploitation, we propose a novel deep reinforcement learning algorithm that combines Hindsight Experience Replay (HER) with Model-Agnostic Meta-Learning (MAML). In environments where rewards are sparse and binary, HER provides relatively effective exploration by converting a single-goal task into a multi-goal one, which improves the search for better policies by learning not only from successful transition trajectories but also from failures. MAML improves exploitation: the proposed algorithm learns faster and adjusts the policy model from limited experience within a few iterations. Extensive simulations on complex tasks of manipulating objects with a robotic arm show that HER integrated with MAML accelerates fine-tuning of the original policy gradient reinforcement learning method with a neural network policy, and also improves the success rate.
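The goal-relabeling idea behind HER, as summarized in the abstract, can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D positions, the tolerance `tol`, the 0 / -1 reward convention, the "future" sampling strategy, and the names `sparse_reward` and `her_relabel` are all assumptions made for illustration.

```python
import random


def sparse_reward(achieved, goal, tol=0.05):
    """Binary sparse reward (assumed convention): 0.0 on success, -1.0 otherwise."""
    return 0.0 if abs(achieved - goal) <= tol else -1.0


def her_relabel(episode, k=4, rng=None):
    """Return the original transitions plus k relabeled copies per step.

    `episode` is a list of (state, action, achieved, goal) tuples, where
    `achieved` is the position reached after taking `action`.
    Relabeled goals are drawn from positions achieved at the current or a
    later step of the same episode (the "future" strategy, assumed here).
    """
    rng = rng or random.Random(0)
    buffer = []
    for t, (state, action, achieved, goal) in enumerate(episode):
        # Keep the transition with its original goal and reward.
        buffer.append((state, action, goal, sparse_reward(achieved, goal)))
        future = episode[t:]
        for _ in range(k):
            # Pretend a position that was actually reached had been the goal.
            new_goal = rng.choice(future)[2]
            buffer.append(
                (state, action, new_goal, sparse_reward(achieved, new_goal))
            )
    return buffer


# A failed episode: the goal is 1.0 but the arm only reaches 0.3.
episode = [
    (0.0, 0.1, 0.1, 1.0),
    (0.1, 0.1, 0.2, 1.0),
    (0.2, 0.1, 0.3, 1.0),
]
buffer = her_relabel(episode, k=4)
```

Every original transition in this episode carries a failure reward, yet the relabeled copies include successes, which is how a single-goal task with sparse rewards becomes a multi-goal one that still yields a learning signal from failed trajectories.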
Pages: 41-46
Number of pages: 6