RGB-D object detection and semantic segmentation for autonomous manipulation in clutter

Cited by: 123
Authors
Schwarz, Max [1]
Milan, Anton [2]
Periyasamy, Arul Selvam [1]
Behnke, Sven [1]
Affiliations
[1] Univ Bonn, Bonn, Germany
[2] Univ Adelaide, Adelaide, SA, Australia
Funding
European Union Horizon 2020;
Keywords
Deep learning; object perception; RGB-D camera; transfer learning; object detection; semantic segmentation;
DOI
10.1177/0278364917713117
Chinese Library Classification number
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
Autonomous robotic manipulation in clutter is challenging. A large variety of objects must be perceived in complex scenes, where they are partially occluded and embedded among many distractors, often in restricted spaces. To tackle these challenges, we developed a deep-learning approach that combines object detection and semantic segmentation. The manipulation scenes are captured with RGB-D cameras, for which we developed a depth fusion method. Employing pretrained features makes learning from small annotated robotic datasets possible. We evaluate our approach on two challenging datasets: one captured for the Amazon Picking Challenge 2016, where our team NimbRo came in second in the Stowing and third in the Picking task; and one captured in disaster-response scenarios. The experiments show that object detection and semantic segmentation complement each other and can be combined to yield reliable object perception.
Pages: 437-451
Number of pages: 15
Related papers
54 in total
  • [21] Behnke S., 2003, Lecture Notes in Computer Science, V2766
  • [22] Berner A, 2013, IEEE IMAGE PROC, P3326, DOI 10.1109/ICIP.2013.6738685
  • [23] Buchholz D, 2014, IEEE INT CONF ROBOT, P875, DOI 10.1109/ICRA.2014.6906957
  • [24] Correll Nikolaus., 2016, IEEE Trans. on Automation Science and Engineering
  • [25] Domae Y, 2014, IEEE INT CONF ROBOT, P1997, DOI 10.1109/ICRA.2014.6907124
  • [26] Model Globally, Match Locally: Efficient and Robust 3D Object Recognition
    Drost, Bertram
    Ulrich, Markus
    Navab, Nassir
    Ilic, Slobodan
    [J]. 2010 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2010, : 998 - 1005
  • [27] Image Guided Depth Upsampling using Anisotropic Total Generalized Variation
    Ferstl, David
    Reinbacher, Christian
    Ranftl, Rene
    Ruether, Matthias
    Bischof, Horst
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 993 - 1000
  • [28] Graves A, 2013, INT CONF ACOUST SPEE, P6645, DOI 10.1109/ICASSP.2013.6638947
  • [29] Cross Modal Distillation for Supervision Transfer
    Gupta, Saurabh
    Hoffman, Judy
    Malik, Jitendra
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 2827 - 2836
  • [30] Deep Residual Learning for Image Recognition
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 770 - 778