Learning Canonical Shape Space for Category-Level 6D Object Pose and Size Estimation

Cited by: 123

Authors:

Chen, Dengsheng [1]

Li, Jun [1]

Wang, Zheng [3]

Xu, Kai [1,2]

Affiliations:

[1] Natl Univ Def Technol, Changsha, Peoples R China

[2] SpeedBot Robot Ltd, Changsha, Peoples R China

[3] Taobao Com, Hangzhou, Peoples R China

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020

Keywords:

DOI:

10.1109/CVPR42600.2020.01199

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present a novel approach to category-level 6D object pose and size estimation. To tackle intra-class shape variations, we learn a canonical shape space (CASS), a unified representation for a large variety of instances of a certain object category. In particular, CASS is modeled as the latent space of a deep generative model of canonical 3D shapes with normalized pose. We train a variational auto-encoder (VAE) for generating 3D point clouds in the canonical space from an RGBD image. The VAE is trained in a cross-category fashion, exploiting publicly available large 3D shape repositories. Since the 3D point cloud is generated in normalized pose (with actual size), the encoder of the VAE learns a view-factorized RGBD embedding. It maps an RGBD image in an arbitrary view into a pose-independent 3D shape representation. Object pose is then estimated by contrasting this representation with a pose-dependent feature of the input RGBD image, extracted with a separate deep neural network. We integrate the learning of CASS and pose and size estimation into an end-to-end trainable network, achieving state-of-the-art performance.
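To make the terms in the abstract concrete, the sketch below illustrates what a 6D pose and a size mean for a point cloud given in a canonical (normalized-pose, actual-size) frame: a rotation R and translation t map canonical points into the camera frame, and the size can be read off the canonical bounding box. It is only an illustration of the geometry, not the paper's network. The function names (`rotation_z`, `apply_pose`, `bbox_size`) and the toy point values are assumptions made up for this example.

```python
import math

def rotation_z(theta):
    """Hypothetical helper: 3x3 rotation about the z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_pose(points, R, t):
    """Map canonical-frame points into the camera frame: p' = R p + t."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def bbox_size(points):
    """Axis-aligned extent (x, y, z) of a point set, in the points' own frame."""
    return tuple(
        max(p[i] for p in points) - min(p[i] for p in points) for i in range(3)
    )

# Toy canonical point cloud (metres); the values are illustrative only.
canonical = [
    (0.1, 0.0, 0.0), (-0.1, 0.0, 0.0),
    (0.0, 0.05, 0.0), (0.0, -0.05, 0.0),
    (0.0, 0.0, 0.2), (0.0, 0.0, -0.2),
]

R = rotation_z(math.pi / 2)   # 3 rotational degrees of freedom (here: 90 degrees about z)
t = (0.0, 0.0, 1.0)           # 3 translational degrees of freedom
observed = apply_pose(canonical, R, t)   # points as seen in the camera frame
size = bbox_size(canonical)              # object size measured in the canonical frame
```

Because the canonical cloud keeps the object's actual size, the size does not depend on the viewpoint; only R and t change from one view to another.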

Pages: 11970-11979

Page count: 10
