Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation from Monocular RGB Image

Cited by: 18
Authors
Fan, Zhaoxin [1 ]
Song, Zhenbo [2 ]
Xu, Jian [4 ]
Wang, Zhicheng [4 ]
Wu, Kejian [4 ]
Liu, Hongyan [3 ]
He, Jun [1 ]
Affiliations
[1] Renmin Univ China, Sch Informat, Key Lab Data Engn & Knowledge Engn MOE, Beijing 100872, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[3] Tsinghua Univ, Dept Management Sci & Engn, Beijing 100084, Peoples R China
[4] Nreal, Beijing, Peoples R China
Source
COMPUTER VISION - ECCV 2022, PT II | 2022, Vol. 13662
Funding
National Natural Science Foundation of China;
Keywords
Category-level 6D pose estimation; Object-level depth; Position hints; Decoupled depth reconstruction;
DOI
10.1007/978-3-031-20086-1_13
Chinese Library Classification (CLC) Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, RGBD-based category-level 6D object pose estimation has achieved promising performance; however, the requirement of depth information prohibits broader applications. To alleviate this problem, this paper proposes a novel approach named Object Level Depth reconstruction Network (OLD-Net), which takes only RGB images as input for category-level 6D object pose estimation. We propose to directly predict object-level depth from a monocular RGB image by deforming the category-level shape prior into object-level depth and the canonical NOCS representation. Two novel modules, Normalized Global Position Hints (NGPH) and Shape-aware Decoupled Depth Reconstruction (SDDR), are introduced to learn high-fidelity object-level depth and delicate shape representations. Finally, the 6D object pose is solved by aligning the predicted canonical representation with the back-projected object-level depth. Extensive experiments on the challenging CAMERA25 and REAL275 datasets show that our model, though simple, achieves state-of-the-art performance.
Pages: 220-236
Number of pages: 17
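
The final step described in the abstract, recovering the 6D pose by aligning the predicted canonical (NOCS) representation with the back-projected object-level depth, is commonly solved with a closed-form similarity-transform estimator such as the Umeyama algorithm. The sketch below is a minimal illustration of that alignment step only, not the authors' implementation; the helper names (backproject, umeyama_alignment), the pinhole back-projection model, and the absence of RANSAC-style outlier rejection are assumptions.

import numpy as np


def backproject(depth, mask, K):
    # Back-project an object-level depth map to a 3D point cloud in the
    # camera frame (hypothetical helper, assumed pinhole model).
    # depth: (H, W) metric depth; mask: (H, W) boolean object mask;
    # K: (3, 3) camera intrinsics.
    v, u = np.nonzero(mask)
    z = depth[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)  # (N, 3)


def umeyama_alignment(src, dst):
    # Closed-form Umeyama solution for dst ~ s * R @ src + t, where src are
    # predicted canonical (NOCS) coordinates and dst are back-projected depth
    # points in one-to-one correspondence, both of shape (N, 3).
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)          # cross-covariance (3, 3)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                        # handle reflection case
    R = U @ S @ Vt                            # rotation
    var_src = (src_c ** 2).sum() / len(src)   # variance of source points
    s = np.trace(np.diag(D) @ S) / var_src    # isotropic scale (object size)
    t = mu_dst - s * R @ mu_src               # translation
    return s, R, t

Given per-pixel NOCS predictions nocs_pts and the reconstructed object-level depth for the same pixels, s, R, t = umeyama_alignment(nocs_pts, backproject(depth, mask, K)) would recover scale, rotation, and translation; the paper's actual pose-solving procedure may differ in details such as outlier handling.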