SD-Pose: Structural Discrepancy Aware Category-Level 6D Object Pose Estimation

被引:4
作者
Li, Guowei [1 ,2 ]
Zhu, Dongchen [1 ,2 ]
Zhang, Guanghui [1 ]
Shi, Wenjun [1 ]
Zhang, Tianyu [1 ,2 ]
Zhang, Xiaolin [1 ,2 ,3 ,4 ,5 ]
Li, Jiamao [1 ,2 ,3 ]
机构
[1] Chinese Acad Sci, Shanghai Inst Microsyst & Informat Technol, State Key Lab Transducer Technol, Bion Vis Syst Lab, Shanghai 200050, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Xiongan Inst Innovat, Xiongan 071700, Peoples R China
[4] Univ Sci & Technol China, Hefei 230027, Anhui, Peoples R China
[5] ShanghaiTech Univ, Shanghai 201210, Peoples R China
来源
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) | 2023年
基金
中国国家自然科学基金;
关键词
D O I
10.1109/WACV56688.2023.00564
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Category-level 6D object pose estimation aims to predict the full pose and size information for previously unseen instances from known categories, which is an essential portion of robot grasping and augmented reality. However, the core challenge of this task still is the enormous shape variation within each category. With regard to the challenge, we propose a novel framework SD-Pose, which utilizes the instance-category structural discrepancy and the potential geometric-semantic association to enhance the exploration of the intra-class shape information. Specifically, an information exchange augmentation (IEA) module is introduced to supplement the instance-category structural information by their structural discrepancy, thus facilitating the enhanced geometric information to contain both the character of instance shape and the commonality of category structure. For complementing the deficiencies of structural information adaptively, a semantic dynamic fusion (SDF) module is further designed to fuse semantic and geometric features. Finally, the proposed SD-Pose framework equipped with the IEA and SDF modules hierarchically supplements instance-category structural information in a stacked manner and achieves state-of-the-art performance on the CAMERA25 and REAL275 datasets.
引用
收藏
页码:5674 / 5683
页数:10
相关论文
共 42 条
[1]  
Burdea Grigore C, 2003, Virtual reality technology, DOI 10.1162/105474603322955950
[2]  
Chen DS, 2020, PROC CVPR IEEE, P11970, DOI 10.1109/CVPR42600.2020.01199
[3]   Progressively Complementarity-aware Fusion Network for RGB-D Salient Object Detection [J].
Chen, Hao ;
Li, Youfu .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :3051-3060
[4]   Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection [J].
Chen, Hao ;
Li, Youfu ;
Su, Dan .
PATTERN RECOGNITION, 2019, 86 :376-385
[5]   SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation [J].
Chen, Kai ;
Dou, Qi .
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, :2753-2762
[6]   FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [J].
Chen, Wei ;
Jia, Xi ;
Chang, Hyung Jin ;
Duan, Jinming ;
Shen, Linlin ;
Leonardis, Ales .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :1581-1590
[7]   G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features [J].
Chen, Wei ;
Jia, Xi ;
Chang, Hyung Jin ;
Duan, Jinming ;
Leonardis, Ales .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, :4232-4241
[8]  
Chen W, 2020, IEEE WINT CONF APPL, P2813, DOI [10.1109/WACV45572.2020.9093272, 10.1109/wacv45572.2020.9093272]
[9]  
Collet A, 2009, IEEE INT CONF ROBOT, P3534
[10]   RANDOM SAMPLE CONSENSUS - A PARADIGM FOR MODEL-FITTING WITH APPLICATIONS TO IMAGE-ANALYSIS AND AUTOMATED CARTOGRAPHY [J].
FISCHLER, MA ;
BOLLES, RC .
COMMUNICATIONS OF THE ACM, 1981, 24 (06) :381-395