GTPT: Group-Based Token Pruning Transformer for Efficient Human Pose Estimation

Citations: 0
Authors
Wang, Haonan [1, 2]
Liu, Jie [1 ]
Tang, Jie [1 ]
Wu, Gangshan [1 ]
Xu, Bo [2 ]
Chou, Yanbing [2 ]
Wang, Yong [2 ]
Affiliations
[1] Nanjing University, State Key Laboratory for Novel Software Technology, Nanjing, People's Republic of China
[2] Cainiao Network, Hangzhou, People's Republic of China
Source
COMPUTER VISION - ECCV 2024, PT LXIX | 2025 / Vol. 15127
Keywords
Efficient human pose estimation; Whole-body pose estimation; Transformer; Token pruning; Group
DOI
10.1007/978-3-031-72890-7_13
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, 2D human pose estimation has made significant progress on public benchmarks. However, many of these approaches see limited use in industry because of their large parameter counts and computational overhead. Efficient human pose estimation remains a hurdle, especially for whole-body pose estimation with numerous keypoints. While most current methods for efficient human pose estimation rely primarily on CNNs, we propose the Group-based Token Pruning Transformer (GTPT), which fully harnesses the advantages of the Transformer. GTPT alleviates the computational burden by gradually introducing keypoints in a coarse-to-fine manner, minimizing computational overhead while maintaining high performance. In addition, GTPT groups keypoint tokens and prunes visual tokens to improve model performance while reducing redundancy. We propose Multi-Head Group Attention (MHGA) between different groups to achieve global interaction with little computational overhead. We conducted experiments on COCO and COCO-WholeBody. The experimental results show that, compared to other methods, GTPT achieves higher performance with less computation, especially for whole-body pose estimation with numerous keypoints.
Pages: 213-230
Page count: 18