GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence

Cited by: 0
Authors
Wang, Pengyuan [1]
Ikeda, Takuya [2]
Lee, Robert [2]
Nishiwaki, Koichi [2]
Affiliations
[1] Tech Univ Munich, Munich, Germany
[2] Woven Toyota, Tokyo, Japan
Source
COMPUTER VISION - ECCV 2024, PT XXVII | 2025 / Vol. 15085
DOI
10.1007/978-3-031-73383-3_7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Category-level pose estimation is a challenging task with many potential applications in computer vision and robotics. Recently, deep-learning-based approaches have made great progress, but are typically hindered by the need for large datasets of either pose-labelled real images or carefully tuned photorealistic simulators. This can be avoided by using only geometry inputs such as depth images to reduce the domain-gap but these approaches suffer from a lack of semantic information, which can be vital in the pose estimation problem. To resolve this conflict, we propose to utilize both geometric and semantic features obtained from a pre-trained foundation model. Our approach projects 2D semantic features into object models as 3D semantic point clouds. Based on the novel 3D representation, we further propose a self-supervision pipeline, and match the fused semantic point clouds against their synthetic rendered partial observations from synthetic object models. The learned knowledge from synthetic data generalizes to observations of unseen objects in the real scenes, without any fine-tuning. We demonstrate this with a rich evaluation on the NOCS, Wild6D and SUN RGB-D benchmarks, showing superior performance over geometric-only and semantic-only baselines with significantly fewer training objects.
Pages: 108-126
Page count: 19