GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

Cited by: 263

Authors:

Wang, Gu [1,2]

Manhardt, Fabian [2]

Tombari, Federico [2,3]

Ji, Xiangyang [1]

Affiliations:

[1] Tsinghua Univ, BNRist, Beijing, Peoples R China

[2] Tech Univ Munich, Munich, Germany

[3] Google, Mountain View, CA 94043 USA

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Funding:

National Key R&D Program of China;

DOI:

10.1109/CVPR46437.2021.01634

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

6D pose estimation from a single RGB image is a fundamental task in computer vision. The current top-performing deep learning-based methods rely on an indirect strategy, i.e., first establishing 2D-3D correspondences between the coordinates in the image plane and object coordinate system, and then applying a variant of the PnP/RANSAC algorithm. However, this two-stage pipeline is not end-to-end trainable, thus is hard to be employed for many tasks requiring differentiable poses. On the other hand, methods based on direct regression are currently inferior to geometry-based methods. In this work, we perform an in-depth investigation on both direct and indirect methods, and propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to learn the 6D pose in an end-to-end manner from dense correspondence-based intermediate geometric representations. Extensive experiments show that our approach remarkably outperforms state-of-the-art methods on LM, LM-O and YCB-V datasets. Code is available at https://git.io/GDR-Net.
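Illustrative note (not part of the original record): the second stage of the indirect strategy described in the abstract, recovering a pose from 2D-3D correspondences, can be sketched with a plain linear PnP solver. The code below is a minimal NumPy sketch under stated assumptions, not GDR-Net and not the authors' code. It assumes noise-free correspondences, a known intrinsic matrix `K`, and at least six non-coplanar points, and it omits the RANSAC outlier loop. The function names `project`, `pnp_dlt`, and `rotation_from_axis_angle` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def rotation_from_axis_angle(axis, angle):
    """Rodrigues' formula: rotation matrix from an axis and an angle (radians)."""
    axis = axis / np.linalg.norm(axis)
    kx = np.array([[0.0, -axis[2], axis[1]],
                   [axis[2], 0.0, -axis[0]],
                   [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * kx + (1.0 - np.cos(angle)) * (kx @ kx)

def project(points_3d, R, t, K):
    """Project object-frame 3D points into pixel coordinates."""
    cam = points_3d @ R.T + t          # object frame -> camera frame
    uv = cam @ K.T                     # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]      # perspective division

def pnp_dlt(points_3d, points_2d, K):
    """Linear (DLT) PnP: estimate R, t from 2D-3D correspondences."""
    n = points_3d.shape[0]
    # Remove the intrinsics so the unknown matrix is [R | t] up to scale.
    homo_2d = np.hstack([points_2d, np.ones((n, 1))])
    norm = homo_2d @ np.linalg.inv(K).T
    x = norm[:, 0] / norm[:, 2]
    y = norm[:, 1] / norm[:, 2]
    X = np.hstack([points_3d, np.ones((n, 1))])
    # Two linear constraints per correspondence on the 12 entries of P.
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = X
    A[0::2, 8:12] = -x[:, None] * X
    A[1::2, 4:8] = X
    A[1::2, 8:12] = -y[:, None] * X
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    P = vt[-1].reshape(3, 4)
    # Project the 3x3 part onto the nearest orthogonal matrix and fix the scale.
    u, s, vt_r = np.linalg.svd(P[:, :3])
    R = u @ vt_r
    t = P[:, 3] / s.mean()
    if np.linalg.det(R) < 0:           # resolve the global sign ambiguity
        R, t = -R, -t
    return R, t

# Synthetic check: build correspondences from a known pose, then recover it.
rng = np.random.default_rng(0)
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
R_true = rotation_from_axis_angle(np.array([1.0, 2.0, 3.0]), 0.7)
t_true = np.array([0.05, -0.02, 0.8])
points_3d = rng.uniform(-0.1, 0.1, size=(8, 3))
points_2d = project(points_3d, R_true, t_true, K)
R_est, t_est = pnp_dlt(points_3d, points_2d, K)
```

In an indirect pipeline the correspondences would come from a network and the solver would usually be wrapped in RANSAC, whose discrete hypothesis selection is one reason the abstract calls the two-stage design not end-to-end trainable.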

Pages: 16606-16616

Page count: 11
