Temporally enhanced graph convolutional network for hand tracking from an egocentric camera

Cited by: 0
Authors
Cho, Woojin [1]
Ha, Taewook [1]
Jeon, Ikbeom [1]
Jeon, Jinwoo [1]
Kim, Tae-Kyun [3, 4]
Woo, Woontack [1, 2]
Affiliations
[1] KAIST UVR Lab, 291 Daehak Ro, Daejeon 34141, South Korea
[2] KAIST KI ITC Augmented Real Res Ctr, 291 Daehak Ro, Daejeon 34141, South Korea
[3] KAIST CVL Lab, 291 Daehak Ro, Daejeon 34141, South Korea
[4] Imperial Coll London, Exhibit Rd, London SW7 2AZ, England
Keywords
Augmented reality; Computer vision; Deep learning; Tracking; Head-mounted displays
DOI
10.1007/s10055-024-01039-3
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
We propose a robust 3D hand tracking system for diverse hand-action scenarios, including hand-object interaction, that takes a single color image and the previous pose prediction as input. We observe that existing methods exploit temporal information deterministically in motion space and therefore fail to handle the diversity of realistic hand motions. Prior methods have also paid little attention to balancing efficiency against robust performance, i.e., the trade-off between runtime and accuracy. The Temporally Enhanced Graph Convolutional Network (TE-GCN) uses a two-stage framework to encode temporal information adaptively. The system achieves this balance by adopting an adaptive GCN, which effectively learns the spatial dependencies between hand mesh vertices. Furthermore, the system leverages the previous prediction by estimating the relevance across image features through an attention mechanism. The proposed method achieves state-of-the-art balanced performance on challenging benchmarks and demonstrates robust results on various hand motions in real scenes. Moreover, the hand tracking system is integrated into a recent HMD with an off-loading framework, achieving a real-time frame rate while maintaining high performance. Our study improves the usability of a high-performance hand-tracking method; the approach can be generalized to other algorithms and contributes to the use of HMDs in everyday life. Our code with the HMD project will be available at https://github.com/UVR-WJCHO/TEGCN_on_Hololens2.
Pages: 18