LA-Net: An End-to-End Category-Level Object Attitude Estimation Network Based on Multi-Scale Feature Fusion and an Attention Mechanism

Cited by: 0
Authors
Wang, Jing [1]
Liu, Guohan [1]
Guo, Cheng [1]
Ma, Qianglong [1]
Song, Wanying [1]
Affiliations
[1] Xi'an University of Science and Technology, School of Communication and Information Engineering, Xi'an 710054, People's Republic of China
Keywords
category-level 6D pose estimation; attention mechanism; feature fusion; 3D graph convolution; pose estimation
DOI
10.3390/electronics13142809
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In category-level object pose estimation, mitigating intra-class shape variation and improving pose estimation accuracy for complex objects remain challenging problems. To address them, this paper proposes a new network architecture, LA-Net, to efficiently ascertain object poses from features. First, we extend the 3D graph convolution network architecture by introducing the LS-Layer (Linear Connection Layer), which enables the network to acquire features from different layers and perform multi-scale feature fusion. Second, LA-Net employs a novel attention mechanism (PSA) and a Max-Pooling layer to extract local and global geometric information, which enhances the network's ability to perceive object poses. Finally, LA-Net recovers the rotation information of an object by decoupling the rotation mechanism. The experimental results show that LA-Net achieves much better object pose estimation accuracy than the baseline method (HS-Pose). Especially for objects with complex shapes, its performance is 8.2% better on the 10° 5 cm metric and 5% better on the 10° 2 cm metric.
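The abstract reports results on the "10° 5 cm" and "10° 2 cm" metrics. As a rough illustration, under the usual reading of such n° m cm thresholds a prediction counts as correct when its rotation error is at most n degrees and its translation error is at most m centimetres. The sketch below is not taken from the paper: the function names are hypothetical, translations are assumed to be given in metres, and the special handling that benchmarks typically apply to symmetric objects is omitted.

```python
import math

def rotation_error_deg(R_pred, R_gt):
    """Angle in degrees between two 3x3 rotation matrices (nested lists)."""
    # trace(R_pred^T @ R_gt) equals the sum of element-wise products.
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error_cm(t_pred, t_gt):
    """Euclidean distance in centimetres; inputs are assumed to be in metres."""
    return 100.0 * math.dist(t_pred, t_gt)

def within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg, max_cm):
    """True if the pose meets both the rotation and translation thresholds."""
    return (rotation_error_deg(R_pred, R_gt) <= max_deg
            and translation_error_cm(t_pred, t_gt) <= max_cm)

def rot_z(deg):
    """Rotation about the z-axis by `deg` degrees (helper for the example)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Example: a prediction that is 8 degrees and 3 cm away from the ground truth.
R_gt, t_gt = rot_z(0.0), [0.0, 0.0, 0.0]
R_pred, t_pred = rot_z(8.0), [0.03, 0.0, 0.0]
passes_10deg_5cm = within_threshold(R_pred, t_pred, R_gt, t_gt, 10.0, 5.0)
passes_10deg_2cm = within_threshold(R_pred, t_pred, R_gt, t_gt, 10.0, 2.0)
```

In this example the pose passes the 10° 5 cm test but fails the stricter 10° 2 cm test, since only the translation threshold differs. A benchmark score would then be the fraction of test poses that pass, which is the kind of percentage the abstract compares.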
Pages: 19
Related papers
38 entries in total
[1]   SC6D: Symmetry-agnostic and Correspondence-free 6D Object Pose Estimation [J].
Cai, Dingding ;
Heikkila, Janne ;
Rahtu, Esa .
2022 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV, 2022, :536-546
[2]   OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation [J].
Cai, Dingding ;
Heikkila, Janne ;
Rahtu, Esa .
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, :6793-6803
[3]  
Castro P, 2020, INT CONF ACOUST SPEE, P4147, DOI 10.1109/ICASSP40776.2020.9053627
[4]   EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation [J].
Chen, Hansheng ;
Wang, Pichao ;
Wang, Fan ;
Tian, Wei ;
Xiong, Lu ;
Li, Hao .
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, :2771-2780
[5]   FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [J].
Chen, Wei ;
Jia, Xi ;
Chang, Hyung Jin ;
Duan, Jinming ;
Shen, Linlin ;
Leonardis, Ales .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :1581-1590
[6]   GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting [J].
Di, Yan ;
Zhang, Ruida ;
Lou, Zhiqiang ;
Manhardt, Fabian ;
Ji, Xiangyang ;
Navab, Nassir ;
Tombari, Federico .
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, :6771-6781
[7]   Zero-Shot 3D Pose Estimation of Unseen Object by Two-step RGB-D Fusion [J].
Duan, Guifang ;
Cheng, Shuai ;
Liu, Zhenyu ;
Zheng, Yanglun ;
Su, Yunhai ;
Tan, Jianrong .
NEUROCOMPUTING, 2024, 597
[8]  
He KM, 2020, IEEE T PATTERN ANAL, V42, P386, DOI [10.1109/TPAMI.2018.2844175, 10.1109/ICCV.2017.322]
[9]   ShAPO: Implicit Representations for Multi-object Shape, Appearance, and Pose Optimization [J].
Irshad, Muhammad Zubair ;
Zakharov, Sergey ;
Ambrus, Rares ;
Kollar, Thomas ;
Kira, Zsolt ;
Gaidon, Adrien .
COMPUTER VISION - ECCV 2022, PT II, 2022, 13662 :275-292
[10]  
Kothari N, 2017, 2017 INDIAN CONTROL CONFERENCE (ICC), P424, DOI 10.1109/INDIANCC.2017.7846512