GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting

Cited by: 90

Authors:

Di, Yan [1]

Zhang, Ruida [2]

Lou, Zhiqiang [2]

Manhardt, Fabian [3]

Ji, Xiangyang [2]

Navab, Nassir [1]

Tombari, Federico [1]

Affiliations:

[1] Tech Univ Munich, Munich, Germany

[2] Tsinghua Univ, Beijing, Peoples R China

[3] Google, Mountain View, CA 94043 USA

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022

Keywords:

DOI:

10.1109/CVPR52688.2022.00666

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

While 6D object pose estimation has recently made a huge leap forward, most methods can still only handle a single or a handful of different objects, which limits their applications. To circumvent this problem, category-level object pose estimation has recently been revamped, which aims at predicting the 6D pose as well as the 3D metric size for previously unseen instances from a given set of object classes. This is, however, a much more challenging task due to severe intra-class shape variations. To address this issue, we propose GPV-Pose, a novel framework for robust category-level pose estimation, harnessing geometric insights to enhance the learning of category-level pose-sensitive features. First, we introduce a decoupled confidence-driven rotation representation, which allows geometry-aware recovery of the associated rotation matrix. Second, we propose a novel geometry-guided point-wise voting paradigm for robust retrieval of the 3D object bounding box. Finally, leveraging these different output streams, we can enforce several geometric consistency terms, further increasing performance, especially for non-symmetric categories. GPV-Pose produces superior results to state-of-the-art competitors on common public benchmarks, whilst almost achieving real-time inference speed at 20 FPS.

Citation

Pages: 6771-6781

Page count: 11

56 references in total

