LA-Net: An End-to-End Category-Level Object Attitude Estimation Network Based on Multi-Scale Feature Fusion and an Attention Mechanism

Cited by: 0

Authors:

Wang, Jing [1]

Liu, Guohan [1]

Guo, Cheng [1]

Ma, Qianglong [1]

Song, Wanying [1]

Affiliation:

[1] Xian Univ Sci & Technol, Sch Commun & Informat Engn, Xian 710054, Peoples R China

Source:

ELECTRONICS | 2024 / Vol. 13 / Issue 14

Keywords:

category-level 6D pose estimation; attention mechanism; feature fusion; 3D graph convolution; POSE ESTIMATION;

DOI:

10.3390/electronics13142809

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

In category-level object pose estimation, mitigating intra-class shape variation and improving pose accuracy for complex objects remain challenging. To address this, the paper proposes a new network architecture, LA-Net, that efficiently estimates object poses from features. First, it extends the 3D graph convolution network architecture with the LS-Layer (Linear Connection Layer), which lets the network acquire features from different layers and perform multi-scale feature fusion. Second, LA-Net employs a novel attention mechanism (PSA) and a max-pooling layer to extract local and global geometric information, which enhances the network's ability to perceive object poses. Finally, LA-Net recovers an object's rotation by means of a decoupled rotation mechanism. The experimental results show that LA-Net achieves much better pose estimation accuracy than the baseline method (HS-Pose). For objects with complex shapes in particular, its performance is 8.2% better on the 10° 5 cm metric and 5% better on the 10° 2 cm metric.
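The n° m cm metrics quoted in the abstract are usually read as follows: a predicted pose counts as correct when its rotation error is at most n degrees and its translation error is at most m cm. The sketch below illustrates that check in plain Python. It is not taken from the paper; the function names, the assumption that translations are given in metres, and the example values are all hypothetical. Benchmark evaluation code typically also treats symmetric objects specially, which this sketch ignores.

```python
import math

def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle (degrees) between two 3x3 rotation matrices (nested lists)."""
    # trace(R_pred @ R_gt^T) equals the sum of element-wise products.
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def translation_error_cm(t_pred, t_gt):
    """Euclidean distance in cm, assuming both translations are in metres."""
    return 100.0 * math.dist(t_pred, t_gt)

def pose_within(R_pred, t_pred, R_gt, t_gt, deg_thresh, cm_thresh):
    """True when both errors fall inside the thresholds (e.g. 10 degrees, 5 cm)."""
    return (rotation_error_deg(R_pred, R_gt) <= deg_thresh
            and translation_error_cm(t_pred, t_gt) <= cm_thresh)

def rot_z(angle_deg):
    """Hypothetical helper: rotation about the z axis by angle_deg degrees."""
    a = math.radians(angle_deg)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a),  math.cos(a), 0.0],
            [0.0,          0.0,         1.0]]

# Made-up example: prediction is off by 8 degrees and 3 cm.
R_gt, t_gt = rot_z(0.0), [0.0, 0.0, 0.0]
R_pred, t_pred = rot_z(8.0), [0.03, 0.0, 0.0]
passes_10deg_5cm = pose_within(R_pred, t_pred, R_gt, t_gt, 10.0, 5.0)
passes_10deg_2cm = pose_within(R_pred, t_pred, R_gt, t_gt, 10.0, 2.0)
```

In this made-up example the prediction passes the looser 10° 5 cm test but fails the stricter 10° 2 cm test, because the 3 cm translation error exceeds 2 cm. A reported score would be the fraction of test poses passing each check.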

Pages: 19
