Bi-directional attention based RGB-D fusion for category-level object pose and shape estimation

Cited by: 0

Authors:

Tang, Kaifeng [1,2,3]

Xu, Chi [1,2,3]

Chen, Ming [1,2,3]

Affiliations:

[1] China Univ Geosci, Sch Automat, Wuhan 430074, Peoples R China

[2] China Univ Geosci, Hubei Key Lab Adv Control & Intelligent Automat Co, Wuhan, Hubei, Peoples R China

[3] Minist Educ, Engn Res Ctr Intelligent Technol Geoexplorat, Wuhan, Hubei, Peoples R China

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2023 / Vol. 83 / Issue 17

Funding:

National Natural Science Foundation of China;

Keywords:

Object pose estimation; Object shape estimation; Attention; RGB-D image; Robotic vision;

DOI:

10.1007/s11042-023-17626-6

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

RGB-D images contain color and geometric information that are complementary for object pose and shape estimation. Normally, a dense-fusion scheme is used to fuse the features extracted from the RGB-D channels for pose estimation of instance-level objects. However, for category-level objects, the effectiveness of dense-fusion features is reduced by significant intra-class variations in color and geometry. To address this problem, we propose AttentionFusion, a bi-directional attention-based RGB-D fusion framework for category-level object pose and shape estimation. In this framework, the complex contextual relationship between the color and geometric features is explored on a global scale by a bi-directional cross-attention mechanism for feature fusion. Based on the fused feature, the 6D pose of the category-level object instance is refined iteratively, and the object shape is also estimated. Experimental results show that the proposed method achieves state-of-the-art performance for object pose and shape estimation on the REAL275 dataset.
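The bi-directional cross-attention idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' AttentionFusion implementation: the function names, feature sizes, single-head attention without learned projections, and the residual-plus-concatenation fusion step are all assumptions chosen only to show how color features can attend to geometric features and vice versa.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    # Single-head scaled dot-product attention (no learned projections,
    # an assumption for brevity): every query row attends over all
    # context rows, i.e. on a global scale.
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)   # (Nq, Nc)
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ context, weights

def bidirectional_fusion(rgb_feat, geo_feat):
    # Direction 1: color features query the geometric features.
    rgb_from_geo, _ = cross_attention(rgb_feat, geo_feat)
    # Direction 2: geometric features query the color features.
    geo_from_rgb, _ = cross_attention(geo_feat, rgb_feat)
    # Hypothetical fusion step: residual connection, then per-point concatenation.
    return np.concatenate(
        [rgb_feat + rgb_from_geo, geo_feat + geo_from_rgb], axis=-1
    )

# Toy input: 128 points, each with a 32-dim color and 32-dim geometric feature.
rng = np.random.default_rng(0)
rgb_feat = rng.standard_normal((128, 32))
geo_feat = rng.standard_normal((128, 32))
fused = bidirectional_fusion(rgb_feat, geo_feat)
print(fused.shape)
```

In this sketch the fused feature has one row per point and twice the per-modality width; a real model would add learned query, key, and value projections and would feed the fused feature into pose and shape heads.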

References

Pages: 53043-53063

Page count: 21

64 entries in total

[1] Scan2CAD: Learning CAD Model Alignment in RGB-D Scans.
Avetisyan, Armen; Dahnert, Manuel; Dai, Angela; Savva, Manolis; Chang, Angel X.; Niessner, Matthias.
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 2609-2618

[2] Brachmann E, 2014, LECT NOTES COMPUT SC, V8690, P536, DOI 10.1007/978-3-319-10605-2_35

[3] CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification.
Chen, Chun-Fu; Fan, Quanfu; Panda, Rameswar.
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 347-356

[4] SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation.
Chen, Kai; Dou, Qi.
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 2753-2762

[5] Chen Wang, 2020, 2020 IEEE International Conference on Robotics and Automation (ICRA), P10059, DOI 10.1109/ICRA40945.2020.9196679

[6] FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism.
Chen, Wei; Jia, Xi; Chang, Hyung Jin; Duan, Jinming; Shen, Linlin; Leonardis, Ales.
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 1581-1590

[7] Multi-View 3D Object Detection Network for Autonomous Driving.
Chen, Xiaozhi; Ma, Huimin; Wan, Ji; Li, Bo; Xia, Tian.
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 6526-6534

[8] Dengsheng Chen, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Proceedings, P11970, DOI 10.1109/CVPR42600.2020.01199

[9] GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting.
Di, Yan; Zhang, Ruida; Lou, Zhiqiang; Manhardt, Fabian; Ji, Xiangyang; Navab, Nassir; Tombari, Federico.
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 6771-6781

[10] Dosovitskiy A, 2021, Arxiv, DOI arXiv:2010.11929
