Learning Sequential Contexts using Transformer for 3D Hand Pose Estimation

Cited by: 1
Authors
Khaleghi, Leyla [1 ,2 ]
Marshall, Joshua [1 ,2 ]
Etemad, Ali [1 ,2 ]
Affiliations
[1] Queens Univ Kingston, Dept ECE, Kingston, ON, Canada
[2] Queens Univ Kingston, Ingenu Labs, Res Inst, Kingston, ON, Canada
Keywords
DOI
10.1109/ICPR56361.2022.9955633
Chinese Library Classification (CLC) number
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D hand pose estimation (HPE) is the process of locating the joints of the hand in 3D from any visual input. HPE has recently received increased attention due to its key role in a variety of human-computer interaction applications. Recent HPE methods have demonstrated the advantages of employing videos or multi-view images, allowing for more robust HPE systems. Accordingly, in this study, we propose a new method to perform Sequential learning with Transformer for Hand Pose (SeTHPose) estimation. Our SeTHPose pipeline begins by extracting visual embeddings from individual hand images. We then use a transformer encoder to learn the sequential context along time or viewing angles and generate accurate 2D hand joint locations. Then, a graph convolutional neural network with a U-Net configuration is used to convert the 2D hand joint locations to 3D poses. Our experiments show that SeTHPose performs well on both hand sequence varieties, temporal and angular. Also, SeTHPose outperforms other methods in the field to achieve new state-of-the-art results on two publicly available sequential datasets, STB and MuViHand.
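The pipeline described in the abstract lends itself to a compact sketch. The following is a minimal, hypothetical PyTorch illustration of that structure (per-image CNN embeddings, a transformer encoder over the temporal or angular sequence, a 2D joint head, and a graph-convolutional lifting stage). The backbone choice, layer sizes, and the plain graph-convolution stack standing in for the GCN U-Net are assumptions for illustration, not the authors' exact architecture.

```python
# Hypothetical sketch of a SeTHPose-style pipeline: per-image CNN embeddings ->
# transformer encoder over the sequence (time steps or viewing angles) ->
# 2D joint regression -> graph-convolutional lifting of 2D joints to 3D.
import torch
import torch.nn as nn
from torchvision.models import resnet18

NUM_JOINTS = 21  # standard 21-joint hand skeleton


class GraphConv(nn.Module):
    """Simple graph convolution: mix joint features through a fixed adjacency."""
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.register_buffer("adj", adj)           # (J, J) normalized adjacency
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):                           # x: (B, J, in_dim)
        return torch.relu(self.linear(self.adj @ x))


class SeTHPoseSketch(nn.Module):
    def __init__(self, adj, embed_dim=512):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()                 # per-image visual embedding (512-d)
        self.backbone = backbone
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=8, batch_first=True)
        self.sequence_encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head_2d = nn.Linear(embed_dim, NUM_JOINTS * 2)   # 2D joints per frame
        # Stand-in for the GCN U-Net that lifts 2D joints to 3D poses.
        self.lift = nn.Sequential(GraphConv(2, 64, adj), GraphConv(64, 64, adj))
        self.out_3d = nn.Linear(64, 3)

    def forward(self, frames):                      # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.backbone(frames.flatten(0, 1)).view(b, t, -1)
        feats = self.sequence_encoder(feats)        # context along time / viewing angle
        joints_2d = self.head_2d(feats).view(b, t, NUM_JOINTS, 2)
        lifted = self.lift(joints_2d.flatten(0, 1))
        joints_3d = self.out_3d(lifted).view(b, t, NUM_JOINTS, 3)
        return joints_2d, joints_3d


adj = torch.eye(NUM_JOINTS)                         # placeholder adjacency (identity)
model = SeTHPoseSketch(adj)
x = torch.randn(2, 8, 3, 224, 224)                  # 2 sequences of 8 hand images
j2d, j3d = model(x)
print(j2d.shape, j3d.shape)                         # (2, 8, 21, 2) and (2, 8, 21, 3)
```

In practice the adjacency would encode the hand skeleton's kinematic connectivity rather than the identity placeholder used here.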
Pages: 535-541
Number of pages: 7