Unsupervised Representation Learning for Visual Robotics Grasping

Cited by: 3
Authors
Wang, Shaochen [1 ]
Zhou, Zhangli [1 ]
Wang, Hao [1 ]
Li, Zhijun [1 ]
Kan, Zhen [1 ]
Affiliations
[1] University of Science and Technology of China, Department of Automation, Hefei 230026, People's Republic of China
Source
2022 INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2022) | 2022
Funding
National Natural Science Foundation of China
DOI
10.1109/ICARM54641.2022.9959267
CLC classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Despite the tremendous success of deep learning in robotic vision, training a high-performance grasp detection model still requires massive amounts of manual annotation and expensive computational resources. Difficulties such as complicated object geometry and sensor noise further challenge the grasping of unknown objects. In this paper, self-supervised representation-learning pre-training is investigated to tackle expensive data annotation and poor generalization, thereby improving visual robotic grasping. The proposed framework has two primary characteristics: 1) Siamese networks integrated with metric learning capture commonalities between similar objects from unlabeled data in an unsupervised fashion. 2) A well-designed encoder-decoder architecture with skip-connections, fusing low-level contour information and high-level semantic information, enables a spatially precise and semantically rich representation. A key benefit of the self-supervised pre-trained model is that it alleviates the burden of data annotation and accelerates model training. By fine-tuning on a small amount of labeled data, our method improves a baseline that does not use deep representation learning by 9.5 points on the Cornell dataset. Our final grasping system is capable of grasping unseen objects in a variety of scenarios on a 7-DoF Franka Emika Panda robot. A video is available at https://youtu.be/Xd0hhYD-IOE.
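The two characteristics above lend themselves to a brief illustration. Below is a minimal PyTorch sketch, not the authors' actual implementation: a Siamese contrastive (metric-learning) pre-training step on two augmented views of unlabeled images, plus an encoder-decoder whose skip-connection fuses low-level encoder features into the decoder. All names (SkipUNet, nt_xent_loss), layer sizes, and augmentations are illustrative assumptions.

```python
# Illustrative sketch only; names, sizes, and augmentations are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkipUNet(nn.Module):
    """Encoder-decoder with a skip-connection: low-level (contour) encoder
    features are concatenated into the decoder stage."""
    def __init__(self, in_ch=3, feat=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, feat, 3, 2, 1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(feat, feat * 2, 3, 2, 1), nn.ReLU())
        self.dec1 = nn.Sequential(
            nn.ConvTranspose2d(feat * 2, feat, 4, 2, 1), nn.ReLU())
        # the skip-connection doubles the channel count at this stage
        self.dec2 = nn.ConvTranspose2d(feat * 2, 1, 4, 2, 1)

    def forward(self, x):
        e1 = self.enc1(x)                  # low-level contour features
        e2 = self.enc2(e1)                 # high-level semantic features
        d1 = self.dec1(e2)
        d1 = torch.cat([d1, e1], dim=1)    # fuse via skip-connection
        return self.dec2(d1)               # e.g. a per-pixel grasp-quality map

    def embed(self, x):
        # global encoder embedding, used only for metric-learning pre-training
        return self.enc2(self.enc1(x)).mean(dim=(2, 3))

def nt_xent_loss(z1, z2, tau=0.5):
    """Contrastive loss over a Siamese pair of batches: two views of the
    same image are pulled together, other images are pushed apart."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float('-inf'))      # exclude self-similarity
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

# One unsupervised pre-training step on an unlabeled batch.
model = SkipUNet()
imgs = torch.rand(8, 3, 224, 224)          # stand-in for unlabeled images
view1 = imgs + 0.05 * torch.randn_like(imgs)   # noise augmentation
view2 = torch.flip(imgs, [3])                  # horizontal-flip augmentation
loss = nt_xent_loss(model.embed(view1), model.embed(view2))
loss.backward()   # afterwards, fine-tune forward() on a few labeled grasps
```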
Pages: 57-62
Page count: 6