Unsupervised Representation Learning for Visual Robotics Grasping

Cited by: 3
Authors
Wang, Shaochen [1 ]
Zhou, Zhangli [1 ]
Wang, Hao [1 ]
Li, Zhijun [1 ]
Kan, Zhen [1 ]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei 230026, Peoples R China
Source
2022 INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2022) | 2022
Funding
National Natural Science Foundation of China
DOI
10.1109/ICARM54641.2022.9959267
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline Classification Code
0812
Abstract
Despite the tremendous success of deep learning in robotic vision, training a high-performance grasp detection model still requires massive amounts of manual annotation and expensive computational resources. Difficulties such as complicated object geometry and sensor noise further challenge the grasping of unknown objects. In this paper, self-supervised representation-learning pre-training is investigated to address expensive data annotation and poor generalization in visual robotic grasping. The proposed framework has two primary characteristics: 1) Siamese networks integrated with metric learning capture commonalities between similar objects from unlabeled data in an unsupervised fashion. 2) A well-designed encoder-decoder architecture with skip connections fuses low-level contour information with high-level semantic information, yielding a spatially precise and semantically rich representation. A key benefit of the self-supervised pre-trained model is that it alleviates the burden of data annotation and accelerates model training. By fine-tuning on a small amount of labeled data, our method improves a baseline without deep representation learning by 9.5 points on the Cornell dataset. The final grasping system is capable of grasping unseen objects in a variety of scenarios on a 7-DoF Franka Emika Panda robot. A video is available at https://youtu.be/Xd0hhYD-IOE.
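The Siamese metric-learning pre-training described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the toy linear-ReLU encoder, the Gaussian-noise "augmented views", and the negative-cosine-similarity objective (in the style of Siamese methods such as SimSiam) are all placeholders, since the abstract does not specify the exact architecture or loss.

```python
import numpy as np

def encode(x, W):
    # Toy one-layer ReLU encoder; stands in for the paper's CNN
    # encoder-decoder (an assumption, not the actual architecture).
    return np.maximum(W @ x, 0.0)

def neg_cos_sim(p, z):
    # Negative cosine similarity between two embeddings, a common
    # Siamese metric-learning objective (assumed here; the abstract
    # does not name the exact loss). Minimizing it pulls the two
    # views' representations together.
    p = p / (np.linalg.norm(p) + 1e-8)
    z = z / (np.linalg.norm(z) + 1e-8)
    return -float(p @ z)

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 32))      # shared encoder weights
x = rng.standard_normal(32)            # one unlabeled "image"

# Two augmented views of the same input (noise stands in for
# crops/color jitter used in practice).
v1 = x + 0.05 * rng.standard_normal(32)
v2 = x + 0.05 * rng.standard_normal(32)

# Both views pass through the SAME (Siamese) encoder; similar
# inputs yield a loss near -1, so no labels are needed.
loss = neg_cos_sim(encode(v1, W), encode(v2, W))
```

In a full pipeline, gradients of this loss would update the shared encoder on unlabeled data, after which the encoder is fine-tuned on a small labeled grasp dataset, as the abstract describes.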
Pages: 57-62 (6 pages)