Bottom-up 2D pose estimation via dual anatomical centers for small-scale persons

Cited by: 9

Authors:

Cheng, Yu [1]

Ai, Yihao [1]

Wang, Bo [2]

Wang, Xinchao [1]

Tan, Robby T. [1]

Affiliations:

[1] Natl Univ Singapore, Elect & Comp Engn, Singapore, Singapore

[2] CtrsVision, Richfield, UT USA

Source:

PATTERN RECOGNITION | 2023 / Vol. 139

Keywords:

Multi-person pose estimation; Human pose estimation; Anatomical centers

DOI:

10.1016/j.patcog.2023.109403

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In multi-person 2D pose estimation, bottom-up methods predict the poses of all persons simultaneously and, unlike top-down methods, do not rely on human detection. However, the accuracy of state-of-the-art (SOTA) bottom-up methods is still inferior to that of existing top-down methods. This is because the predicted poses are regressed from an inconsistent human bounding-box center and lack human-scale normalization, which makes the predicted poses inaccurate and causes small-scale persons to be missed. To push the envelope of bottom-up pose estimation, we first propose multi-scale training, which enhances the network's ability to handle scale variation with single-scale testing, particularly for small-scale persons. Second, we introduce dual anatomical centers (i.e., head and body), from which we can predict human poses more accurately and reliably, especially for small-scale persons. Moreover, existing bottom-up methods use multi-scale testing to boost accuracy at the price of multiple additional forward passes, which weakens the efficiency of bottom-up methods, their core strength compared to top-down methods. By contrast, our multi-scale training enables the model to predict high-quality poses in a single forward pass (i.e., single-scale testing). Our method achieves a 38.4% improvement in bounding-box precision and a 39.1% improvement in bounding-box recall over the SOTA on the challenging small-scale persons subset of COCO. For human pose AP evaluation, we achieve a new SOTA (71.0 AP) on the COCO test-dev set with single-scale testing. We also achieve the top performance (40.3 AP) on the OCHuman dataset in cross-dataset evaluation. (c) 2023 Elsevier Ltd. All rights reserved.
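The abstract's point about an inconsistent bounding-box center can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the head and body centers are computed, so the definitions below (mean of the COCO head keypoints, mean of shoulders and hips) and the helper names `bbox_center` and `mean_point` are assumptions made only for illustration. The keypoint indices follow the standard 17-joint COCO order, and the coordinates are made-up toy values.

```python
from statistics import fmean

# Standard COCO 17-keypoint order: 0 nose, 1-2 eyes, 3-4 ears,
# 5-6 shoulders, 7-8 elbows, 9-10 wrists, 11-12 hips, 13-14 knees, 15-16 ankles.
HEAD_IDS = [0, 1, 2, 3, 4]     # assumed definition of a "head center"
TORSO_IDS = [5, 6, 11, 12]     # assumed definition of a "body center"


def bbox_center(kpts):
    """Center of the tightest box around all keypoints."""
    xs = [x for x, _ in kpts]
    ys = [y for _, y in kpts]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


def mean_point(kpts, ids):
    """Mean location of the selected keypoints."""
    return (fmean([kpts[i][0] for i in ids]), fmean([kpts[i][1] for i in ids]))


# Toy standing pose (x, y) in pixels.
standing = [
    (50, 10), (48, 8), (52, 8), (46, 9), (54, 9),   # head
    (40, 25), (60, 25),                             # shoulders
    (38, 40), (62, 40),                             # elbows
    (37, 55), (63, 55),                             # wrists
    (44, 60), (56, 60),                             # hips
    (44, 80), (56, 80),                             # knees
    (44, 100), (56, 100),                           # ankles
]

# Same person with the right arm stretched out sideways.
reaching = list(standing)
reaching[8] = (78, 25)    # right elbow
reaching[10] = (95, 25)   # right wrist

for name, pose in [("standing", standing), ("reaching", reaching)]:
    print(name,
          "bbox:", bbox_center(pose),
          "head:", mean_point(pose, HEAD_IDS),
          "body:", mean_point(pose, TORSO_IDS))
```

In this toy case the bounding-box center moves from x = 50 to x = 66 when only the arm moves, while the two assumed anatomical centers stay where they were. That is the kind of inconsistency a regression target tied to the box center would inherit.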

Pages: 13

References: 52 in total

