Bottom-up 2D pose estimation via dual anatomical centers for small-scale persons

Cited by: 9
Authors
Cheng, Yu [1]
Ai, Yihao [1]
Wang, Bo [2]
Wang, Xinchao [1]
Tan, Robby T. [1]
Affiliations
[1] Natl Univ Singapore, Elect & Comp Engn, Singapore, Singapore
[2] CtrsVision, Richfield, UT USA
Keywords
Multi-person pose estimation; Human pose estimation; Anatomical centers
DOI
10.1016/j.patcog.2023.109403
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In multi-person 2D pose estimation, bottom-up methods predict the poses of all persons simultaneously and, unlike top-down methods, do not rely on human detection. However, the accuracy of state-of-the-art (SOTA) bottom-up methods is still inferior to that of existing top-down methods. This is because the predicted human poses are regressed from an inconsistent human bounding box center and lack human-scale normalization, which makes the predicted poses inaccurate and causes small-scale persons to be missed. To push the envelope of bottom-up pose estimation, we first propose multi-scale training, which enhances the network's ability to handle scale variation with single-scale testing, particularly for small-scale persons. Second, we introduce dual anatomical centers (i.e., head and body), with which we can predict human poses more accurately and reliably, especially for small-scale persons. Moreover, existing bottom-up methods use multi-scale testing to boost the accuracy of pose estimation at the price of multiple additional forward passes, which weakens the efficiency of bottom-up methods, their core strength compared to top-down methods. By contrast, our multi-scale training enables the model to predict high-quality poses in a single forward pass (i.e., single-scale testing). Our method achieves a 38.4% improvement in bounding box precision and a 39.1% improvement in bounding box recall over the SOTA on the challenging small-scale persons subset of COCO. For the human pose AP evaluation, we achieve a new SOTA (71.0 AP) on the COCO test-dev set with single-scale testing. We also achieve the top performance (40.3 AP) on the OCHuman dataset in cross-dataset evaluation. © 2023 Elsevier Ltd. All rights reserved.
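The efficiency argument in the abstract (multi-scale testing costs one forward pass per scale, while multi-scale training moves the scale variation into training so that inference needs a single pass) can be illustrated with a minimal sketch. This is not the authors' implementation: `CountingModel`, the scale set `(0.5, 1.0, 2.0)`, the training range `(0.5, 2.0)`, and all function names are hypothetical choices made only for illustration.

```python
import random


class CountingModel:
    """Hypothetical stand-in for a pose network; it only counts forward passes."""

    def __init__(self):
        self.forward_passes = 0

    def __call__(self, image_size):
        self.forward_passes += 1
        # A real network would return heatmaps; here we just echo the input size.
        return image_size


def scaled(base_size, s):
    """Scale a (height, width) pair by factor s, rounded to whole pixels."""
    return (round(base_size[0] * s), round(base_size[1] * s))


def multi_scale_test(model, base_size, scales=(0.5, 1.0, 2.0)):
    """Multi-scale testing: one forward pass per scale (scale set is an assumption)."""
    return [model(scaled(base_size, s)) for s in scales]


def single_scale_test(model, base_size):
    """Single-scale testing: exactly one forward pass."""
    return model(base_size)


def sample_training_size(base_size, scale_range=(0.5, 2.0), rng=random):
    """Multi-scale training: draw a random input scale per sample (range is an assumption)."""
    return scaled(base_size, rng.uniform(*scale_range))
```

With these assumed settings, `multi_scale_test` calls the model three times per image, whereas `single_scale_test` calls it once; the random rescaling in `sample_training_size` is applied only while training.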
Pages: 13