Unsupervised Representation Learning for Visual Robotics Grasping

Cited by: 3
Authors
Wang, Shaochen [1 ]
Zhou, Zhangli [1 ]
Wang, Hao [1 ]
Li, Zhijun [1 ]
Kan, Zhen [1 ]
Affiliations
[1] University of Science and Technology of China, Department of Automation, Hefei 230026, People's Republic of China
Source
2022 INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2022) | 2022
Funding
National Natural Science Foundation of China
DOI
10.1109/ICARM54641.2022.9959267
CLC classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Despite the tremendous success of deep learning in robotic vision, training a high-performance grasp detection model still requires massive amounts of manual annotation and expensive computational resources. Difficulties such as complicated object geometry and sensor noise further challenge the grasping of unknown objects. In this paper, self-supervised representation-learning pre-training is investigated to tackle expensive data annotation and poor generalization, thereby improving visual robotic grasping. The proposed framework has two primary characteristics: 1) Siamese networks integrated with metric learning capture commonalities between similar objects from unlabeled data in an unsupervised fashion. 2) A well-designed encoder-decoder architecture with skip-connections, fusing low-level contour information and high-level semantic information, enables a spatially precise and semantically rich representation. A key benefit of the self-supervised pre-trained model is that it alleviates the burden of data annotation and accelerates model training. By fine-tuning on a small amount of labeled data, our method improves a baseline that does not use deep representation learning by 9.5 points on the Cornell dataset. Our final grasping system is capable of grasping unseen objects in a variety of scenarios on a 7-DoF Franka Emika Panda robot. A video is available at https://youtu.be/Xd0hhYD-IOE.
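The two characteristics above lend themselves to a brief illustration. Below is a minimal PyTorch sketch, not the authors' actual implementation: a Siamese contrastive (metric-learning) pre-training step on two augmented views of unlabeled images, plus an encoder-decoder whose skip-connection fuses low-level encoder features into the decoder. All names (SkipUNet, nt_xent_loss), layer sizes, and augmentations are illustrative assumptions.

```python
# Illustrative sketch only; names, sizes, and augmentations are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkipUNet(nn.Module):
    """Encoder-decoder with a skip-connection: low-level (contour) encoder
    features are concatenated into the decoder stage."""
    def __init__(self, in_ch=3, feat=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, feat, 3, 2, 1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(feat, feat * 2, 3, 2, 1), nn.ReLU())
        self.dec1 = nn.Sequential(
            nn.ConvTranspose2d(feat * 2, feat, 4, 2, 1), nn.ReLU())
        # the skip-connection doubles the channel count at this stage
        self.dec2 = nn.ConvTranspose2d(feat * 2, 1, 4, 2, 1)

    def forward(self, x):
        e1 = self.enc1(x)                  # low-level contour features
        e2 = self.enc2(e1)                 # high-level semantic features
        d1 = self.dec1(e2)
        d1 = torch.cat([d1, e1], dim=1)    # fuse via skip-connection
        return self.dec2(d1)               # e.g. a per-pixel grasp-quality map

    def embed(self, x):
        # global encoder embedding, used only for metric-learning pre-training
        return self.enc2(self.enc1(x)).mean(dim=(2, 3))

def nt_xent_loss(z1, z2, tau=0.5):
    """Contrastive loss over a Siamese pair of batches: two views of the
    same image are pulled together, other images are pushed apart."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float('-inf'))      # exclude self-similarity
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

# One unsupervised pre-training step on an unlabeled batch.
model = SkipUNet()
imgs = torch.rand(8, 3, 224, 224)          # stand-in for unlabeled images
view1 = imgs + 0.05 * torch.randn_like(imgs)   # noise augmentation
view2 = torch.flip(imgs, [3])                  # horizontal-flip augmentation
loss = nt_xent_loss(model.embed(view1), model.embed(view2))
loss.backward()   # afterwards, fine-tune forward() on a few labeled grasps
```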
Pages: 57-62
Page count: 6