InferTrans: Hierarchical structural fusion transformer for crowded human pose estimation

Authors
Li, Muyu [1, 2]
Wang, Yingfeng [4]
Hu, Henan [3]
Zhao, Xudong [1, 2]
Affiliations
[1] Dalian Univ Technol, Inst Intelligent Sci & Technol, Sch Control Sci & Engn, Dalian 116024, Liaoning, Peoples R China
[2] Dalian Univ Technol, Key Lab Intelligent Control & Optimizat Ind Equipm, Minist Educ, Dalian 116024, Liaoning, Peoples R China
[3] Dalian Jiaotong Univ, Sch Mech Engn, Dalian 116028, Liaoning, Peoples R China
[4] Ctr Intelligent Multidimens Data Anal, Hong Kong Sci Pk, Hong Kong, Peoples R China
Keywords
Human pose estimation; Occlusion handling; Transformer
DOI
10.1016/j.inffus.2024.102878
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Human pose estimation in crowded scenes is challenging because of frequent occlusions and complex interactions between individuals. To address these issues, we introduce InferTrans, a hierarchical structural fusion Transformer designed to improve crowded human pose estimation. InferTrans integrates semantic features into structural information using a hierarchical joint-limb-semantic fusion module. By reorganizing joints and limbs into a tree structure, the fusion module facilitates effective information exchange across different structural levels and leverages both global structural information and local contextual details. Furthermore, we explicitly model limb structural patterns separately from joints, treating limbs as vectors with defined lengths and orientations. This allows our model to infer complete human poses from minimal input, significantly improving pose refinement. Extensive experiments on multiple datasets demonstrate that InferTrans outperforms existing pose estimation techniques in crowded and occluded scenarios. InferTrans serves as a robust post-processing technique that improves the accuracy and robustness of pose estimation in challenging environments.
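
The limb representation mentioned in the abstract (limbs as vectors with a length and an orientation, arranged along a joint tree) can be illustrated with a small sketch. This is not the InferTrans implementation: the five-joint skeleton, the joint names, the 2D coordinates, and both helper functions are assumptions chosen only to show how a tree of limb vectors and a single root joint determine every other joint position.

```python
import math

# Hypothetical 5-joint skeleton; names and tree layout are illustrative only.
# Each entry maps a joint to its parent; parents are listed before children.
PARENT = {
    "neck": None,  # root of the tree
    "l_shoulder": "neck",
    "l_elbow": "l_shoulder",
    "r_shoulder": "neck",
    "r_elbow": "r_shoulder",
}


def limbs_from_joints(joints):
    """Convert 2D joint coordinates into limb vectors (length, orientation)."""
    limbs = {}
    for child, parent in PARENT.items():
        if parent is None:
            continue
        dx = joints[child][0] - joints[parent][0]
        dy = joints[child][1] - joints[parent][1]
        # Length in pixels, orientation in radians measured from the x-axis.
        limbs[(parent, child)] = (math.hypot(dx, dy), math.atan2(dy, dx))
    return limbs


def joints_from_limbs(root_xy, limbs):
    """Rebuild all joint coordinates by walking the tree from the root."""
    joints = {"neck": root_xy}
    for child, parent in PARENT.items():
        if parent is None:
            continue
        length, angle = limbs[(parent, child)]
        px, py = joints[parent]
        joints[child] = (px + length * math.cos(angle),
                         py + length * math.sin(angle))
    return joints


pose = {
    "neck": (0.0, 0.0),
    "l_shoulder": (3.0, 4.0),
    "l_elbow": (3.0, 10.0),
    "r_shoulder": (-3.0, 4.0),
    "r_elbow": (-3.0, 10.0),
}
limbs = limbs_from_joints(pose)
rebuilt = joints_from_limbs(pose["neck"], limbs)
```

In this toy setting the round trip recovers the original joints from the root position and the limb vectors alone, which is the geometric intuition behind inferring a pose from partial input; how InferTrans predicts or fuses these limb vectors is not described in this record.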
Pages: 14