Volumetric Instance-Aware Semantic Mapping and 3D Object Discovery

Times Cited: 181
Authors
Grinvald, Margarita [1 ]
Furrer, Fadri [1 ]
Novkovic, Tonci [1 ]
Chung, Jen Jen [1 ]
Cadena, Cesar [1 ]
Siegwart, Roland [1 ]
Nieto, Juan [1 ]
Affiliations
[1] Swiss Fed Inst Technol, Autonomous Syst Lab, CH-8092 Zurich, Switzerland
Funding
Swiss National Science Foundation;
Keywords
RGB-D perception; object detection; segmentation and categorization; mapping;
DOI
10.1109/LRA.2019.2923960
CLC Classification Number
TP24 [Robotics];
Discipline Code
080202 ; 1405 ;
Abstract
To autonomously navigate and plan interactions in real-world environments, robots require the ability to robustly perceive and map complex, unstructured surrounding scenes. Besides building an internal representation of the observed scene geometry, the key insight toward a truly functional understanding of the environment is the usage of higher-level entities during mapping, such as individual object instances. This work presents an approach to incrementally build volumetric object-centric maps during online scanning with a localized RGB-D camera. First, a per-frame segmentation scheme combines an unsupervised geometric approach with instance-aware semantic predictions to detect both recognized scene elements and previously unseen objects. Next, a data association step tracks the predicted instances across the different frames. Finally, a map integration strategy fuses information about their 3D shape, location, and, if available, semantic class into a global volume. Evaluation on a publicly available dataset shows that the proposed approach for building instance-level semantic maps is competitive with state-of-the-art methods, while additionally able to discover objects of unseen categories. The system is further evaluated within a real-world robotic mapping setup, for which qualitative results highlight the online nature of the method. Code is available at https://github.com/ethz-asl/voxblox-plusplus.
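The three stages described in the abstract (per-frame segmentation, data association across frames, and volumetric map integration) can be illustrated with a minimal sketch. All names below are hypothetical toy constructs for exposition, not the voxblox-plusplus API; the sketch assumes segments arrive as sets of voxel coordinates and uses simple voxel-overlap matching in place of the paper's actual association strategy.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """One object instance in the global volume (toy model)."""
    label: int            # persistent instance ID in the map
    semantic_class: str   # "unknown" for geometry-only discoveries
    voxels: set = field(default_factory=set)


class InstanceMap:
    """Toy global volume: tracks instances and fuses per-frame segments."""

    def __init__(self):
        self.instances = {}
        self._next_label = 0

    def associate(self, frame_voxels):
        """Data association step: match a per-frame segment to the
        existing instance with the largest voxel overlap, if any."""
        best, best_overlap = None, 0
        for inst in self.instances.values():
            overlap = len(inst.voxels & frame_voxels)
            if overlap > best_overlap:
                best, best_overlap = inst, overlap
        return best

    def integrate(self, frame_voxels, semantic_class="unknown"):
        """Map integration step: fuse the segment's shape (and semantic
        class, when a prediction is available) into the global volume."""
        inst = self.associate(frame_voxels)
        if inst is None:
            # Unmatched segment: discovered a new object instance,
            # possibly of a category the detector does not recognize.
            inst = Instance(self._next_label, semantic_class)
            self.instances[inst.label] = inst
            self._next_label += 1
        inst.voxels |= frame_voxels
        if semantic_class != "unknown":
            inst.semantic_class = semantic_class
        return inst.label
```

Feeding two overlapping segments of the same object into `integrate` yields the same label, while a disjoint segment with no semantic prediction is still mapped as a new, unrecognized instance, mirroring the paper's ability to discover objects of unseen categories.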
Pages: 3037-3044
Page Count: 8
Related Papers
26 records in total
[1]  
Furrer F, 2018, IEEE INT C INT ROBOT, P6835, DOI 10.1109/IROS.2018.8594391
[2]  
He KM, 2017, IEEE I CONF COMP VIS, P2980, DOI [10.1109/TPAMI.2018.2844175, 10.1109/ICCV.2017.322]
[3]   Learning to Segment Every Thing [J].
Hu, Ronghang ;
Dollar, Piotr ;
He, Kaiming ;
Darrell, Trevor ;
Girshick, Ross .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :4233-4241
[4]   SceneNN: a Scene Meshes Dataset with aNNotations [J].
Hua, Binh-Son ;
Quang-Hieu Pham ;
Duc Thanh Nguyen ;
Minh-Khoi Tran ;
Yu, Lap-Fai ;
Yeung, Sai-Kit .
PROCEEDINGS OF 2016 FOURTH INTERNATIONAL CONFERENCE ON 3D VISION (3DV), 2016, :92-101
[5]  
Joseph RK, 2016, CRIT POL ECON S ASIA, P1
[6]  
Kellert M., 2013, 2013 Conference on Lasers & Electro-Optics. Europe & International Quantum Electronics Conference (CLEO EUROPE/IQEC), DOI 10.1109/CLEOE-IQEC.2013.6800663
[7]   Microsoft COCO: Common Objects in Context [J].
Lin, Tsung-Yi ;
Maire, Michael ;
Belongie, Serge ;
Hays, James ;
Perona, Pietro ;
Ramanan, Deva ;
Dollar, Piotr ;
Zitnick, C. Lawrence .
COMPUTER VISION - ECCV 2014, PT V, 2014, 8693 :740-755
[8]  
McCormac John, 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA), P4628, DOI 10.1109/ICRA.2017.7989538
[9]   Fusion++: Volumetric Object-Level SLAM [J].
McCormac, John ;
Clark, Ronald ;
Bloesch, Michael ;
Davison, Andrew J. ;
Leutenegger, Stefan .
2018 INTERNATIONAL CONFERENCE ON 3D VISION (3DV), 2018, :32-41
[10]  
Nakajima Y, 2018, IEEE INT C INT ROBOT, P385, DOI 10.1109/IROS.2018.8593993