Deep Reinforcement Learning for Peg-in-hole Assembly Task Via Information Utilization Method

Cited by: 0
Authors
Fei Wang
Ben Cui
Yue Liu
Baiming Ren
Affiliations
[1] Northeastern University, Faculty of Robot Science and Engineering
[2] Northeastern University, College of Information Science and Engineering
Source
Journal of Intelligent & Robotic Systems | 2022 / Volume 106
Keywords
Deep reinforcement learning; Robotic arm; Peg-in-hole; Demonstration information processing
DOI
Not available
Abstract
Deep reinforcement learning has been widely studied in many fields of robotics, but its application is severely limited by low convergence efficiency. Although demonstration information can effectively improve convergence speed, relying on it too heavily degrades training in the real environment and worsens convergence. Historical information should also be considered, since it affects both the efficiency of information utilization and the convergence of the algorithm, yet few studies have addressed it so far. This paper proposes an improved reinforcement learning algorithm that adds a demonstration-information utilization mechanism and an LSTM network to Proximal Policy Optimization (PPO). Demonstration information provides the robot with a prior knowledge base, and the utilization mechanism balances demonstration data against interaction data, improving data efficiency. In addition, we restructure the network used in deep reinforcement learning to incorporate historical information. Experimental results show that the method is feasible and, compared with existing solutions, significantly improves the convergence of autonomous robot learning.
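The abstract names two algorithmic ingredients: an LSTM-based network that lets PPO use historical observations, and a mechanism that balances demonstration data against interaction data. The Python (PyTorch) sketch below illustrates one plausible realization of both ideas under our own assumptions; the class names, parameters, and the linear decay schedule are illustrative and are not details taken from the paper.

import random
import torch
import torch.nn as nn

class LSTMActorCritic(nn.Module):
    """PPO actor-critic whose shared encoder is an LSTM over the observation history."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.policy_head = nn.Linear(hidden, act_dim)  # mean of a Gaussian action distribution
        self.value_head = nn.Linear(hidden, 1)
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs_seq, hidden_state=None):
        # obs_seq: (batch, time, obs_dim); the LSTM summarizes the observation history.
        out, hidden_state = self.lstm(obs_seq, hidden_state)
        feat = out[:, -1]  # features at the current time step
        return self.policy_head(feat), self.value_head(feat), hidden_state

class MixedBatchSampler:
    """Mixes demonstration and interaction transitions in each mini-batch.
    The demonstration share decays linearly with the update index, a simple
    stand-in for a demonstration-utilization schedule (illustrative assumption)."""
    def __init__(self, demos, initial_demo_ratio=0.5, decay_updates=200):
        self.demos = demos
        self.initial_demo_ratio = initial_demo_ratio
        self.decay_updates = decay_updates

    def sample(self, interactions, batch_size, update_idx):
        ratio = max(0.0, self.initial_demo_ratio * (1.0 - update_idx / self.decay_updates))
        n_demo = min(int(batch_size * ratio), len(self.demos))
        batch = random.sample(self.demos, n_demo)
        batch += random.sample(interactions, batch_size - n_demo)
        return batch

With such a schedule, early PPO updates draw most of each mini-batch from demonstrations, while later updates rely mainly on interaction data, reflecting the balance between demonstration and interaction information described in the abstract.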