Reinforcement Learning for Pick and Place Operations in Robotics: A Survey

Cited by: 30
Authors
Lobbezoo, Andrew [1 ]
Qian, Yanjun [1 ]
Kwon, Hyock-Ju [1 ]
Affiliations
[1] Univ Waterloo, Dept Mech & Mechatron Engn, AI Mfg Lab, Waterloo, ON N2L 3G1, Canada
Keywords
reinforcement learning; Markov decision process; policy optimization; robotic control; simulation environment; pose estimation; imitation learning
DOI
10.3390/robotics10030105
Chinese Library Classification (CLC)
TP24 [Robotics]
Discipline Classification Codes
080202; 1405
Abstract
The field of robotics has developed rapidly in recent years, and training robotic agents with reinforcement learning has become a major focus of research. This survey reviews the application of reinforcement learning to pick-and-place operations, a task that a logistics robot can be trained to complete without support from a robotics engineer. To introduce the topic, we first review the fundamentals of reinforcement learning and various methods of policy optimization, such as value iteration and policy search. Next, factors that affect the pick-and-place task, such as reward shaping, imitation learning, pose estimation, and the simulation environment, are examined. Following this review of the fundamentals and key factors, we present an extensive survey of all methods implemented by researchers in the field to date. The strengths and weaknesses of each method from the literature are discussed, and the contribution of each manuscript to the field is reviewed. The concluding critical discussion of the available literature and the summary of open problems indicate that experimental validation, model generalization, and grasp-pose selection are topics that require additional research.
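The abstract mentions value iteration as one policy-optimization method over a Markov decision process. As a minimal illustrative sketch (the two-state toy MDP, its transition table, and all names below are hypothetical and not taken from the survey), tabular value iteration repeatedly applies the Bellman optimality backup until the value function stops changing:

```python
# Toy MDP (hypothetical, for illustration): two states, two actions.
# transitions[s][a] is a list of (probability, next_state, reward) outcomes.
transitions = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
}
gamma = 0.9  # discount factor


def value_iteration(transitions, gamma, tol=1e-8):
    """Return the optimal state-value function via Bellman backups."""
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            # Q-value of each action: expected reward plus discounted next value.
            q = [sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                 for outcomes in actions.values()]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:  # converged: no state value changed by more than tol
            return V


V = value_iteration(transitions, gamma)
# Always choosing action 1 earns reward 1 per step, so both states
# converge toward 1 / (1 - gamma) = 10.
```

Policy-search methods, by contrast, optimize the policy parameters directly rather than deriving the policy from a converged value table.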
Pages: 27
References
79 entries in total
[1]  
Abbeel P., 2004, P 21 INT C MACH LEAR, P1, DOI 10.1145/1015330.1015430
[2]  
Ajaykumar G., 2021, ARXIV210501757
[3]  
Al-Selwi H.F., 2021, P 2021 IEEE 17 INT C
[4]   Twin Delayed Hierarchical Actor-Critic [J].
Anca, Mihai ;
Studley, Matthew .
2021 7TH INTERNATIONAL CONFERENCE ON AUTOMATION, ROBOTICS AND APPLICATIONS (ICARA 2021), 2021, :221-225
[5]  
[Anonymous], 2014, INTRO ROBOTICS
[6]  
[Anonymous], 1994, MACHINE LEARNING, DOI 10.1016/B978-1-55860-335-6.50030-1
[7]  
Atkeson CG, 1997, IEEE INT CONF ROBOT, P3557, DOI 10.1109/ROBOT.1997.606886
[8]  
Beltran-Hernandez CC, 2019, IEEE/SICE I S SYS IN, P468, DOI 10.1109/SII.2019.8700399
[9]  
Berscheid L, 2019, IEEE INT C INT ROBOT, P612, DOI 10.1109/IROS40897.2019.8968042
[10]  
Biggs G., 2003, P AUSTR C ROB AUT, P27, DOI 10.1109/ROBOT.2001.932554