Unsupervised Representation Learning for Visual Robotics Grasping

Cited by: 3
Authors
Wang, Shaochen [1 ]
Zhou, Zhangli [1 ]
Wang, Hao [1 ]
Li, Zhijun [1 ]
Kan, Zhen [1 ]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei 230026, Peoples R China
Source
2022 INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2022) | 2022
Funding
National Natural Science Foundation of China
DOI
10.1109/ICARM54641.2022.9959267
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline Classification Code
0812
Abstract
Despite the tremendous success of deep learning in robotic vision, training a high-performance grasp detection model still requires massive amounts of manual annotation and expensive computational resources. Difficulties such as complicated object geometry and sensor noise further challenge the grasping of unknown objects. In this paper, self-supervised representation-learning pre-training is investigated to address expensive data annotation and poor generalization in visual robotic grasping. The proposed framework has two primary characteristics: 1) Siamese networks integrated with metric learning capture commonalities between similar objects from unlabeled data in an unsupervised fashion. 2) A well-designed encoder-decoder architecture with skip connections fuses low-level contour information with high-level semantic information, yielding a spatially precise and semantically rich representation. A key benefit of the self-supervised pre-trained model is that it alleviates the burden of data annotation and accelerates model training. By fine-tuning on a small amount of labeled data, our method improves a baseline without deep representation learning by 9.5 points on the Cornell dataset. The final grasping system is capable of grasping unseen objects in a variety of scenarios on a 7-DoF Franka Emika Panda robot. A video is available at https://youtu.be/Xd0hhYD-IOE.
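The Siamese metric-learning pre-training described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the toy linear-ReLU encoder, the Gaussian-noise "augmented views", and the negative-cosine-similarity objective (in the style of Siamese methods such as SimSiam) are all placeholders, since the abstract does not specify the exact architecture or loss.

```python
import numpy as np

def encode(x, W):
    # Toy one-layer ReLU encoder; stands in for the paper's CNN
    # encoder-decoder (an assumption, not the actual architecture).
    return np.maximum(W @ x, 0.0)

def neg_cos_sim(p, z):
    # Negative cosine similarity between two embeddings, a common
    # Siamese metric-learning objective (assumed here; the abstract
    # does not name the exact loss). Minimizing it pulls the two
    # views' representations together.
    p = p / (np.linalg.norm(p) + 1e-8)
    z = z / (np.linalg.norm(z) + 1e-8)
    return -float(p @ z)

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 32))      # shared encoder weights
x = rng.standard_normal(32)            # one unlabeled "image"

# Two augmented views of the same input (noise stands in for
# crops/color jitter used in practice).
v1 = x + 0.05 * rng.standard_normal(32)
v2 = x + 0.05 * rng.standard_normal(32)

# Both views pass through the SAME (Siamese) encoder; similar
# inputs yield a loss near -1, so no labels are needed.
loss = neg_cos_sim(encode(v1, W), encode(v2, W))
```

In a full pipeline, gradients of this loss would update the shared encoder on unlabeled data, after which the encoder is fine-tuned on a small labeled grasp dataset, as the abstract describes.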
Pages: 57-62 (6 pages)