CVML-Pose: Convolutional VAE Based Multi-Level Network for Object 3D Pose Estimation

Cited by: 3

Authors:

Zhao, Jianyu [1]

Sanderson, Edward [1]

Matuszewski, Bogdan J. J. [1]

Affiliations:

[1] Univ Cent Lancashire, Comp Vis & Machine Learning CVML Grp, Preston PR1 2HE, England

Source:

IEEE Access | 2023 / Vol. 11

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC);

Keywords:

3D pose estimation; deep learning; variational autoencoder; synthetic data; 6D POSE;

DOI:

10.1109/ACCESS.2023.3243551

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Most vision-based 3D pose estimation approaches rely on knowledge of the object's 3D model and on depth measurements, and often require time-consuming iterative refinement to improve accuracy. These requirements can limit broader real-life application. To address these limitations, a novel Convolutional Variational Auto-Encoder based Multi-Level Network for object 3D pose estimation (CVML-Pose) is proposed. Unlike most other methods, CVML-Pose implicitly learns an object's 3D pose from RGB images alone, encoded in its latent space, without knowledge of the object's 3D model or depth information and without post-refinement. CVML-Pose consists of two main modules: (i) CVML-AE, a convolutional variational autoencoder that extracts features from RGB images, and (ii) Multi-Layer Perceptron and K-Nearest Neighbor regressors that map the latent variables to the object's 3D pose, including, respectively, rotation and translation. CVML-Pose has been evaluated on the LineMod and LineMod-Occlusion benchmark datasets. It outperforms other methods based on latent representations and achieves results comparable to the state of the art, but without using a 3D model or depth measurements. Visualization with the t-Distributed Stochastic Neighbor Embedding algorithm shows that the CVML-Pose latent space successfully represents objects' category and topology. This opens up the prospect of integrated estimation of pose and other attributes (possibly including surface finish or shape variations), which, with real-time processing made possible by the absence of iterative refinement, can facilitate various robotic applications. Code is available at https://github.com/JZhao12/CVML-Pose.
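As a rough illustration of the second module described in the abstract, the following is a minimal sketch of a K-Nearest Neighbor regressor that maps latent vectors to a 3D translation. It is not the authors' implementation (see the linked repository for that). The latent dimension, the synthetic latent codes, the linear ground-truth mapping `W`, and the function name `knn_regress` are all assumptions made for this example; in the paper the latent codes would come from the CVML-AE encoder.

```python
import numpy as np


def knn_regress(train_latents, train_targets, query_latents, k=3):
    """Predict a target for each query latent vector as the mean target
    of its k nearest training latents (Euclidean distance)."""
    # Pairwise squared distances, shape (n_query, n_train).
    diffs = query_latents[:, None, :] - train_latents[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    # Indices of the k closest training latents for each query.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Average the targets of those neighbors, shape (n_query, target_dim).
    return train_targets[nearest].mean(axis=1)


rng = np.random.default_rng(0)
latent_dim = 8  # hypothetical latent size, not taken from the paper

# Synthetic stand-ins for encoder outputs and ground-truth translations.
train_latents = rng.normal(size=(200, latent_dim))
W = rng.normal(size=(latent_dim, 3))  # hypothetical latent-to-translation map
train_translations = train_latents @ W

# With k=1, querying training points returns their own stored targets.
pred = knn_regress(train_latents, train_translations, train_latents[:5], k=1)
```

A nearest-neighbor regressor needs no training beyond storing the latent codes, so in a sketch like this the quality of the prediction depends entirely on how well the latent space organizes poses.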

Pages: 13830-13845

Page count: 16

