DBMHT: A double-branch multi-hypothesis transformer for 3D human pose estimation in video

Times Cited: 0
Authors
Xiang, Xuezhi [1,2]
Li, Xiaoheng [1]
Bao, Weijie [1]
Qiao, Yulong [1,3]
El Saddik, Abdulmotaleb [3]
Affiliations
[1] Harbin Engineering University, School of Information & Communication Engineering, Harbin 150001, People's Republic of China
[2] Ministry of Industry and Information Technology, Key Laboratory of Advanced Marine Communication and Information Technology, Harbin 150001, People's Republic of China
[3] University of Ottawa, School of Electrical Engineering and Computer Science, Ottawa, ON K1N 6N5, Canada
Funding
National Natural Science Foundation of China; Natural Science Foundation of Heilongjiang Province;
Keywords
3D human pose estimation; Transformer; Dual-branch; Cross-hypothesis;
DOI
10.1016/j.cviu.2024.104147
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Estimating 3D human poses from monocular video is a significant challenge: existing methods struggle with depth ambiguity and self-occlusion. To address these problems, we propose a Double-Branch Multi-Hypothesis Transformer (DBMHT). Specifically, we employ a double-branch architecture to capture temporal and spatial information and to generate multiple hypotheses, and we adopt a lightweight module that integrates the spatial and temporal representations to merge these hypotheses. DBMHT can not only capture spatial information from each joint of the human body and temporal information from each frame of the video, but also merge multiple hypotheses that carry different spatio-temporal information. Comprehensive evaluation on two challenging datasets (i.e., Human3.6M and MPI-INF-3DHP) demonstrates the superior performance of DBMHT, marking it as a robust and efficient approach for accurate 3D HPE in dynamic scenarios. The results show that our model surpasses the state-of-the-art approach by 1.9% in MPJPE with ground-truth 2D keypoints as input.
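The abstract's pipeline (two attention branches over joints and over frames, per-hypothesis pose regression, then a lightweight fusion of the hypotheses) can be sketched as follows. This is a minimal illustration of the general idea only: the identity Q/K/V projections, the random stand-in regression heads, and the averaging fusion are assumptions for brevity, not the authors' learned modules.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # Single-head self-attention with identity projections (shape illustration only).
    scores = softmax(x @ x.transpose(0, 2, 1) / np.sqrt(x.shape[-1]))
    return scores @ x

def dbmht_sketch(seq, n_hyp=3, seed=0):
    """seq: (frames, joints, channels) lifted 2D keypoint features.
    Returns a fused 3D pose for the centre frame, shape (joints, 3)."""
    f, j, c = seq.shape
    rng = np.random.default_rng(seed)
    hypotheses = []
    for _ in range(n_hyp):
        # Spatial branch: attention across joints within each frame.
        spatial = self_attention(seq)                       # (f, j, c)
        # Temporal branch: attention across frames for each joint.
        temporal = self_attention(seq.transpose(1, 0, 2))   # (j, f, c)
        temporal = temporal.transpose(1, 0, 2)              # (f, j, c)
        # Each hypothesis regresses 3D joints with its own (stand-in) head.
        feat = np.concatenate([spatial, temporal], axis=-1)  # (f, j, 2c)
        head = rng.standard_normal((2 * c, 3)) / np.sqrt(2 * c)
        hypotheses.append(feat[f // 2] @ head)               # (j, 3)
    # Lightweight fusion: plain averaging stands in for the learned merge module.
    return np.mean(hypotheses, axis=0)

pose3d = dbmht_sketch(np.random.default_rng(1).standard_normal((9, 17, 8)))
print(pose3d.shape)  # (17, 3): one 3D position per Human3.6M-style joint
```

In the paper's design the fusion module is learned rather than a plain mean, which lets hypotheses with different spatio-temporal evidence be weighted adaptively.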
Pages: 8