Temporally enhanced graph convolutional network for hand tracking from an egocentric camera

Cited by: 0

Authors:

Cho, Woojin [1]

Ha, Taewook [1]

Jeon, Ikbeom [1]

Jeon, Jinwoo [1]

Kim, Tae-Kyun [3,4]

Woo, Woontack [1,2]

Affiliations:

[1] KAIST UVR Lab, 291 Daehak Ro, Daejeon 34141, South Korea

[2] KAIST KI ITC Augmented Real Res Ctr, 291 Daehak Ro, Daejeon 34141, South Korea

[3] KAIST CVL Lab, 291 Daehak Ro, Daejeon 34141, South Korea

[4] Imperial Coll London, Exhibit Rd, London SW7 2AZ, England

Source:

Virtual Reality | 2024 / Vol. 28 / Issue 3

Keywords:

Augmented reality; Computer vision; Deep learning; Tracking; Head mounted displays

DOI:

10.1007/s10055-024-01039-3

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Subject classification codes:

081203; 0835

Abstract:

We propose a 3D hand tracking system that is robust across various hand action environments, including hand-object interaction, and that takes a single color image and the previous pose prediction as input. We observe that existing methods exploit temporal information deterministically in motion space and therefore fail to handle realistic, diverse hand motions. Prior methods have also paid little attention to balancing efficiency against robust performance, i.e., the trade-off between time and accuracy. The Temporally Enhanced Graph Convolutional Network (TE-GCN) uses a two-stage framework to encode temporal information adaptively. The system achieves this balance by adopting an adaptive GCN, which effectively learns the spatial dependency between hand mesh vertices. Furthermore, the system leverages the previous prediction by estimating the relevance across image features through an attention mechanism. The proposed method achieves state-of-the-art balanced performance on challenging benchmarks and demonstrates robust results on various hand motions in real scenes. Moreover, the hand tracking system is integrated into a recent HMD with an off-loading framework, achieving a real-time frame rate while maintaining high performance. Our study improves the usability of a high-performance hand-tracking method, which can be generalized to other algorithms and contributes to the use of HMDs in everyday life. Our code with the HMD project will be available at https://github.com/UVR-WJCHO/TEGCN_on_Hololens2.
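
The abstract does not specify how the adaptive GCN is built. As a rough illustration of the general idea only, the sketch below shows one common formulation of an adaptive graph convolution, in which a learnable offset matrix is added to a fixed, row-normalized mesh adjacency so that the layer can learn vertex dependencies beyond the mesh edges. All names here (`adaptive_graph_conv`, `learned_offset`, the 3-vertex toy graph) are hypothetical and are not taken from the TE-GCN code; whether TE-GCN uses this exact form is an assumption.

```python
# Illustrative sketch only: names, shapes and the formulation are assumptions,
# not the TE-GCN implementation.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_normalize(adj):
    """Scale each row to sum to 1 so a vertex averages over its neighbours."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def adaptive_graph_conv(features, fixed_adj, learned_offset, weight):
    """One layer computing Y = (normalize(A_fixed) + B) X W.

    features:       V x C_in vertex features X
    fixed_adj:      V x V mesh connectivity with self-loops (A_fixed)
    learned_offset: V x V trainable matrix B (zeros here, i.e. untrained)
    weight:         C_in x C_out projection W
    """
    norm = row_normalize(fixed_adj)
    adaptive = [[a + b for a, b in zip(row_a, row_b)]
                for row_a, row_b in zip(norm, learned_offset)]
    return matmul(matmul(adaptive, features), weight)


# Toy graph: 3 vertices in a chain 0-1-2, with self-loops.
fixed_adj = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
learned_offset = [[0.0] * 3 for _ in range(3)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, to make the aggregation visible

output = adaptive_graph_conv(features, fixed_adj, learned_offset, weight)
```

With a zero offset the layer reduces to plain neighbourhood averaging; during training, non-zero entries in the offset would let a vertex draw on vertices it is not connected to in the mesh, which is the sense in which such a layer is "adaptive".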

Pages: 18

References: 84 in total

