Multi-hop graph transformer network for 3D human pose estimation

Cited by: 4
Authors
Islam, Zaedul [1]
Ben Hamza, A. [1]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
3D human pose estimation; Graph convolutional network; Transformer; Multi-hop; Dilated convolution;
DOI
10.1016/j.jvcir.2024.104174
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Accurate 3D human pose estimation is a challenging task due to occlusion and depth ambiguity. In this paper, we introduce a multi-hop graph transformer network designed for 2D-to-3D human pose estimation in videos by leveraging the strengths of multi-head self-attention and multi-hop graph convolutional networks with disentangled neighborhoods to capture spatio-temporal dependencies and handle long-range interactions. The proposed network architecture consists of a graph attention block composed of stacked layers of multi-head self-attention and graph convolution with a learnable adjacency matrix, and a multi-hop graph convolutional block composed of multi-hop convolutional and dilated convolutional layers. The combination of multi-head self-attention and multi-hop graph convolutional layers enables the model to capture both local and global dependencies, while the integration of dilated convolutional layers enhances the model's ability to handle spatial details required for accurate localization of the human body joints. Extensive experiments demonstrate the effectiveness and generalization ability of our model, achieving competitive performance on benchmark datasets.
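The idea of multi-hop aggregation over disentangled neighborhoods can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "disentangled" means each joint's neighbors are grouped by exact shortest-path distance (one ring per hop count), and it replaces the learnable weight matrices, self-attention, and dilated convolutions of the paper with fixed scalar weights per hop. The function names, the 5-joint chain skeleton, and the weights are all hypothetical.

```python
from collections import deque


def hop_distances(edges, num_joints):
    """Shortest-path hop count between every pair of joints (BFS from each joint)."""
    neighbors = [[] for _ in range(num_joints)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    dist = [[None] * num_joints for _ in range(num_joints)]
    for src in range(num_joints):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[src][v] is None:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def disentangled_neighborhoods(dist, max_hop):
    """hoods[k][i] lists the joints exactly k hops away from joint i."""
    n = len(dist)
    return [[[j for j in range(n) if dist[i][j] == k] for i in range(n)]
            for k in range(max_hop + 1)]


def multi_hop_aggregate(features, hoods, hop_weights):
    """Mean-pool features inside each hop ring, then mix rings with scalar weights.

    Assumption: scalar weights stand in for learnable per-hop weight matrices.
    """
    dim = len(features[0])
    out = []
    for i in range(len(features)):
        agg = [0.0] * dim
        for ring, w in zip(hoods, hop_weights):
            members = ring[i]
            if not members:
                continue  # no joint at exactly this distance from joint i
            for d in range(dim):
                agg[d] += w * sum(features[j][d] for j in members) / len(members)
        out.append(agg)
    return out


# Toy 5-joint chain 0-1-2-3-4 (hypothetical; not a benchmark skeleton).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
dist = hop_distances(edges, 5)
hoods = disentangled_neighborhoods(dist, max_hop=2)

# 2D joint coordinates used as input features.
poses_2d = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
mixed = multi_hop_aggregate(poses_2d, hoods, hop_weights=[0.5, 0.3, 0.2])
```

Because each ring holds only the joints at one exact distance, a 2-hop term does not re-count 1-hop neighbors, which is one way to keep distant joints from being drowned out by nearby ones.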
Pages: 12
Related papers
50 in total
  • [1] DGFormer: Dynamic graph transformer for 3D human pose estimation
    Chen, Zhangmeng
    Dai, Ju
    Bai, Junxuan
    Pan, Junjun
    PATTERN RECOGNITION, 2024, 152
  • [2] SCGFormer: Semantic Chebyshev Graph Convolution Transformer for 3D Human Pose Estimation
    Liang, Jiayao
    Yin, Mengxiao
    APPLIED SCIENCES-BASEL, 2024, 14 (04):
  • [3] Combination of Deep Learner Network and Transformer for 3D Human Pose Estimation
    Tien-Dat Tran
    Xuan-Thuy Vo
    Duy-Linh Nguyen
    Jo, Kang-Hyun
    2022 22ND INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2022), 2022, : 174 - 178
  • [4] MULTI HYBRID EXTRACTOR NETWORK FOR 3D HUMAN POSE ESTIMATION
    Yuan, Zhixiang
    Zhang, Xitie
    Wu, Suping
    Zhang, Boyang
    Peng, Yuxin
    Wang, Bing
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 3170 - 3174
  • [5] HOGFormer: high-order graph convolution transformer for 3D human pose estimation
    Xie, Yuhong
    Hong, Chaoqun
    Zhuang, Weiwei
    Liu, Lijuan
    Li, Jie
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2025, 16 (01) : 599 - 610
  • [6] Relation-balanced graph convolutional network for 3D human pose estimation
    Chen, Lu
    Liu, Qiong
    IMAGE AND VISION COMPUTING, 2023, 140
  • [7] Hybrid Attention MLP-Graph Network for 3D Human Pose Estimation
    Qiu, Feiyue
    Sun, Lin
    Peng, Delong
    Zhou, Jian
    2024 5TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND APPLICATION, ICCEA 2024, 2024, : 243 - 247
  • [8] HandDAGT: A Denoising Adaptive Graph Transformer for 3D Hand Pose Estimation
    Cheng, Wencan
    Kim, Eunji
    Ko, Jong Hwan
    COMPUTER VISION - ECCV 2024, PT LXXXVIII, 2025, 15146 : 35 - 52
  • [9] Efficient Hierarchical Multi-view Fusion Transformer for 3D Human Pose Estimation
    Zhou, Kangkang
    Zhang, Lijun
    Lu, Feng
    Zhou, Xiang-Dong
    Shi, Yu
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 7512 - 7520
  • [10] Dual-Path Transformer for 3D Human Pose Estimation
    Zhou, Lu
    Chen, Yingying
    Wang, Jinqiao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3260 - 3270