Learning by Watching via Keypoint Extraction and Imitation Learning

Cited by: 0
Authors
Sun, Yin-Tung Albert [1 ]
Lin, Hsin-Chang [2 ]
Wu, Po-Yen [3 ]
Huang, Jung-Tang [3 ]
Affiliations
[1] Natl Taipei Univ Technol, Grad Inst Mfg Technol, Taipei 10608, Taiwan
[2] Natl Taipei Univ Technol, Grad Inst Mech & Elect Engn, Taipei 10608, Taiwan
[3] Natl Taipei Univ Technol, Dept Mech Engn, Taipei 10608, Taiwan
Keywords
imitation learning; reinforcement learning; keypoint detection; image transition
DOI
10.3390/machines10111049
Chinese Library Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Discipline Code
0808; 0809
Abstract
In recent years, the use of reinforcement learning and imitation learning to complete robot control tasks has become increasingly popular. Learning from expert demonstrations has long been a goal of researchers; however, the lack of action data has been a significant limitation to learning from human demonstration. We propose an architecture based on a new 3D keypoint tracking model and generative adversarial imitation learning to learn from expert demonstrations. We used 3D keypoint tracking to compensate for the lack of action data in plain images and then used image-to-image translation to convert human hand demonstrations into robot images, which enabled the subsequent generative adversarial imitation learning to proceed smoothly. The estimation time of the 3D keypoint tracking model together with the calculation time of the subsequent optimization algorithm was 30 ms. Under correct detection, the coordinate errors of the model's projection onto the real 3D keypoints were all within 1.8 cm. Keypoint tracking required no sensors on the body, and the operator needed no vision-related knowledge to calibrate the camera. By merely setting up a generic depth camera to track the mapping changes of keypoints after behavior-cloning training, the robot could learn human tasks by watching, including picking and placing an object and pouring water. We built an experimental environment in pybullet to validate our concept of the simplest behavioral-cloning imitation and confirm that learning succeeds. The proposed method achieved satisfactory performance with a sample efficiency of 20 demonstration sets for pick-and-place and 30 sets for pouring water.
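The abstract's final step, behavior cloning on tracked 3D keypoints, reduces to supervised regression from keypoint observations to robot actions. The following is a minimal sketch of that idea, assuming the tracked keypoints are flattened into a fixed-length observation vector paired with robot commands; the keypoint count, action dimension, network shape, and training settings are illustrative assumptions, not details taken from the paper.

```python
# Minimal behavior-cloning sketch (illustrative assumptions, not the authors' code).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

NUM_KEYPOINTS = 21           # assumed: one hand's tracked keypoints
OBS_DIM = NUM_KEYPOINTS * 3  # x, y, z per keypoint
ACT_DIM = 7                  # assumed: 6-DoF end-effector pose + gripper

# Simple MLP policy mapping keypoint observations to robot actions.
policy = nn.Sequential(
    nn.Linear(OBS_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, ACT_DIM),
)

# Random tensors stand in for real (keypoint, action) demonstration pairs,
# which in the paper's pipeline would come from the depth-camera tracker.
obs = torch.randn(500, OBS_DIM)
act = torch.randn(500, ACT_DIM)
loader = DataLoader(TensorDataset(obs, act), batch_size=64, shuffle=True)

opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for epoch in range(50):
    for o, a in loader:
        loss = nn.functional.mse_loss(policy(o), a)  # imitate expert actions
        opt.zero_grad()
        loss.backward()
        opt.step()
```

A policy trained this way would then be rolled out in the pybullet simulation described in the abstract; the generative adversarial variant replaces this fixed regression loss with a reward from a learned discriminator.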
Pages: 18