Reinforcement Learning for Pick and Place Operations in Robotics: A Survey

Cited by: 36
Authors
Lobbezoo, Andrew [1]
Qian, Yanjun [1]
Kwon, Hyock-Ju [1]
Affiliations
[1] Univ Waterloo, Dept Mech & Mechatron Engn, AI Mfg Lab, Waterloo, ON N2L 3G1, Canada
Keywords
reinforcement learning; Markov decision process; policy optimization; robotic control; simulation environment; pose estimation; imitation learning
DOI
10.3390/robotics10030105
Chinese Library Classification number
TP24 [Robotics technology]
Discipline classification code
080202; 1405
Abstract
The field of robotics has been developing rapidly in recent years, and training robotic agents with reinforcement learning has been a major focus of research. This survey reviews the application of reinforcement learning to pick-and-place operations, a task that a logistics robot can be trained to complete without support from a robotics engineer. To introduce this topic, we first review the fundamentals of reinforcement learning and various methods of policy optimization, such as value iteration and policy search. Next, factors that have an impact on the pick-and-place task, such as reward shaping, imitation learning, pose estimation, and the simulation environment, are examined. Following the review of the fundamentals and key factors for reinforcement learning, we present an extensive review of all methods implemented by researchers in the field to date. The strengths and weaknesses of each method from the literature are discussed, and details about the contribution of each manuscript to the field are reviewed. The concluding critical discussion of the available literature and the summary of open problems indicate that experimental validation, model generalization, and grasp pose selection are topics that require additional research.
Pages: 27