Multi-hop graph transformer network for 3D human pose estimation

Cited by: 4
Authors
Islam, Zaedul [1]
Ben Hamza, A. [1]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
3D human pose estimation; Graph convolutional network; Transformer; Multi-hop; Dilated convolution;
DOI
10.1016/j.jvcir.2024.104174
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Accurate 3D human pose estimation is a challenging task due to occlusion and depth ambiguity. In this paper, we introduce a multi-hop graph transformer network designed for 2D-to-3D human pose estimation in videos. The network leverages the strengths of multi-head self-attention and multi-hop graph convolutional networks with disentangled neighborhoods to capture spatio-temporal dependencies and handle long-range interactions. The proposed architecture consists of a graph attention block, composed of stacked layers of multi-head self-attention and graph convolution with a learnable adjacency matrix, and a multi-hop graph convolutional block, composed of multi-hop convolutional and dilated convolutional layers. The combination of multi-head self-attention and multi-hop graph convolutional layers enables the model to capture both local and global dependencies, while the integration of dilated convolutional layers enhances the model's ability to handle the spatial details required for accurate localization of the human body joints. Extensive experiments demonstrate the effectiveness and generalization ability of our model, which achieves competitive performance on benchmark datasets.
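As a rough illustration of the "multi-hop graph convolution with disentangled neighborhoods" idea named in the abstract, the NumPy sketch below builds one adjacency matrix per hop distance (joint j is a k-hop neighbor of joint i only if their shortest-path distance is exactly k) and aggregates features per hop. This is not the authors' implementation: the function names, the 4-joint chain skeleton, the row normalization, and the per-hop weight matrices are all assumptions made for illustration.

```python
import numpy as np

def k_hop_adjacencies(adj, max_hop):
    """Return [A_0, ..., A_K]; A_k[i, j] = 1 iff the shortest path i -> j has exactly k edges."""
    n = adj.shape[0]
    a_hat = ((adj + np.eye(n)) > 0).astype(np.int64)   # adjacency with self-loops
    within_prev = np.zeros((n, n), dtype=np.int64)     # reachable in fewer than k hops
    within = np.eye(n, dtype=np.int64)                 # reachable in at most k hops
    hops = []
    for _ in range(max_hop + 1):
        hops.append(within - within_prev)              # exactly k hops away
        within_prev = within
        within = ((within @ a_hat) > 0).astype(np.int64)
    return hops

def multi_hop_graph_conv(x, hops, weights):
    """Sum of row-normalized per-hop aggregations (assumed form): sum_k norm(A_k) @ x @ W_k."""
    out = 0.0
    for a_k, w_k in zip(hops, weights):
        deg = a_k.sum(axis=1, keepdims=True)
        a_norm = a_k / np.maximum(deg, 1)              # guard against rows with no k-hop neighbors
        out = out + a_norm @ x @ w_k
    return out

# Hypothetical toy skeleton: a chain of 4 joints, 0 - 1 - 2 - 3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
hops = k_hop_adjacencies(adj, max_hop=2)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 2))                            # 2D joint coordinates as input features
weights = [rng.normal(size=(2, 3)) for _ in hops]      # one assumed weight matrix per hop
y = multi_hop_graph_conv(x, hops, weights)             # per-joint output features, shape (4, 3)
```

Because each hop matrix keeps only joints at exactly that distance, nearer neighbors are not counted again at larger hops, which is the usual motivation for disentangling the neighborhoods.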
Pages: 12