CVML-Pose: Convolutional VAE Based Multi-Level Network for Object 3D Pose Estimation

Cited by: 3
Authors
Zhao, Jianyu [1]
Sanderson, Edward [1]
Matuszewski, Bogdan J. [1]
Affiliations
[1] Univ Cent Lancashire, Comp Vis & Machine Learning CVML Grp, Preston PR1 2HE, England
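Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
3D pose estimation; deep learning; variational autoencoder; synthetic data; 6D pose
DOI
10.1109/ACCESS.2023.3243551
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract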
Most vision-based 3D pose estimation approaches rely on knowledge of the object's 3D model, on depth measurements, and often on time-consuming iterative refinement to improve accuracy. These requirements can limit broader real-life applications. To address these limitations, this paper proposes a novel Convolutional Variational Auto-Encoder based Multi-Level Network for object 3D pose estimation (CVML-Pose). Unlike most other methods, CVML-Pose implicitly learns an object's 3D pose from RGB images only, encoding it in its latent space, without knowing the object's 3D model or depth information and without performing post-refinement. CVML-Pose consists of two main modules: (i) CVML-AE, a convolutional variational autoencoder whose role is to extract features from RGB images, and (ii) Multi-Layer Perceptron and K-Nearest Neighbor regressors that map the latent variables to the object's 3D pose, namely rotation and translation, respectively. CVML-Pose has been evaluated on the LineMod and LineMod-Occlusion benchmark datasets. It outperforms other methods based on latent representations and achieves results comparable to the state of the art, but without using a 3D model or depth measurements. Visualization with the t-Distributed Stochastic Neighbor Embedding algorithm shows that the CVML-Pose latent space successfully represents objects' category and topology. This opens up the prospect of integrated estimation of pose and other attributes (possibly also including surface finish or shape variations), which, with real-time processing made possible by the absence of iterative refinement, can facilitate various robotic applications. Code available: https://github.com/JZhao12/CVML-Pose.
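The second module described in the abstract maps latent variables to pose with regressors, using K-Nearest Neighbor for translation. The snippet below is a minimal sketch of that general idea only, not the authors' implementation (their code is at the repository linked above). The function name `knn_regress`, the latent dimension of 8, and the synthetic linear map from latents to translations are all assumptions made for illustration; in the actual method the latent vectors would come from the trained CVML-AE encoder.

```python
import numpy as np


def knn_regress(train_latents, train_targets, query_latents, k=3):
    """Predict each query's target as the mean target of its k nearest training latents.

    Hypothetical helper for illustration; not taken from the CVML-Pose code base.
    """
    # Pairwise Euclidean distances, shape (n_query, n_train).
    diffs = query_latents[:, None, :] - train_latents[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Indices of the k closest training latents for every query.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Average the targets of those neighbours, shape (n_query, target_dim).
    return train_targets[nearest].mean(axis=1)


rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions): 200 latent vectors of dimension 8,
# and a made-up linear map producing 3D translations from them.
train_latents = rng.normal(size=(200, 8))
made_up_map = rng.normal(size=(8, 3))
train_translations = train_latents @ made_up_map

# Query with latents already in the training set; with k=1 each query
# retrieves its own stored translation.
query_latents = train_latents[:5]
predicted = knn_regress(train_latents, train_translations, query_latents, k=1)
print(predicted.shape)
```

A nearest-neighbour regressor of this kind needs no iterative optimisation at inference time, which is consistent with the abstract's emphasis on avoiding post-refinement; rotation would be handled by a separate regressor (a Multi-Layer Perceptron in the paper), which is not sketched here.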
Pages: 13830-13845
Page count: 16