Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation

Cited by: 495
Authors
Wang, He [1 ]
Sridhar, Srinath [1 ]
Huang, Jingwei [1 ]
Valentin, Julien [2 ]
Song, Shuran [3 ]
Guibas, Leonidas J. [1 ,4 ]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Google Inc, Mountain View, CA USA
[3] Princeton Univ, Princeton, NJ 08544 USA
[4] Facebook AI Res, Sunnyvale, CA USA
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
DOI
10.1109/CVPR.2019.00275
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The goal of this paper is to estimate the 6D pose and dimensions of unseen object instances in an RGB-D image. Contrary to "instance-level" 6D pose estimation tasks, our problem assumes that no exact object CAD models are available during either training or testing time. To handle different and unseen object instances in a given category, we introduce a Normalized Object Coordinate Space (NOCS), a shared canonical representation for all possible object instances within a category. Our region-based neural network is then trained to directly infer the correspondence from observed pixels to this shared object representation (NOCS) along with other object information such as class label and instance mask. These predictions can be combined with the depth map to jointly estimate the metric 6D pose and dimensions of multiple objects in a cluttered scene. To train our network, we present a new context-aware technique to generate large amounts of fully annotated mixed reality data. To further improve our model and evaluate its performance on real data, we also provide a fully annotated real-world dataset with large environment and instance variation. Extensive experiments demonstrate that the proposed method is able to robustly estimate the pose and size of unseen object instances in real environments while also achieving state-of-the-art performance on standard 6D pose estimation benchmarks.
Pages: 2637-2646
Page count: 10
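The abstract describes combining predicted per-pixel NOCS coordinates with the depth map to recover a metric 6D pose and size. A standard way to fit such a scale-rotation-translation alignment between NOCS-space points and back-projected camera-space points is the Umeyama least-squares similarity transform; the sketch below shows that general technique (not necessarily the authors' exact fitting procedure), with hypothetical numpy point arrays as inputs.

```python
import numpy as np

def estimate_similarity_transform(src, dst):
    """Umeyama least-squares fit of dst ~ s * R @ src + t.

    src: (N, 3) predicted NOCS coordinates for one object's pixels.
    dst: (N, 3) corresponding camera-space points from the depth map.
    Returns scale s (float), rotation R (3x3), translation t (3,).
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    n = src.shape[0]
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    # Cross-covariance between the centered point sets.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)
    # Reflection guard so the recovered rotation has det(R) = +1.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0
    R = U @ S @ Vt
    # Scale from the singular values relative to the source variance.
    var_src = (src_c ** 2).sum() / n
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t
```

Because NOCS normalizes each instance into a unit cube, the recovered scale directly gives the object's metric size, while R and t give its 6D pose; in practice such a fit is typically wrapped in RANSAC to reject outlier correspondences.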