Spatial and contextual aware network based on multi-resolution for human pose estimation

Cited by: 5

Authors:

Zhang, Qingyu [1]

Chen, Ying [1]

Affiliations:

[1] Jiangnan Univ, Minist Educ, Key Lab Adv Proc Control Light Ind, Wuxi 214122, Jiangsu, Peoples R China

Source:

VISUAL COMPUTER | 2023 / Vol. 39 / Issue 02

Funding:

National Natural Science Foundation of China;

Keywords:

Human pose estimation; Detail information; Global dependency; Contextual information;

DOI:

10.1007/s00371-021-02364-3

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

To capture high-resolution spatial information and rich contextual information for accurate localization and inference of keypoints in human pose estimation, a multi-resolution Spatial and Contextual Aware Network (SCANet) is proposed. The network is based on HRNet and extends it with three modules: the Spatial Self-Attention Module (SSAM), the Information Supplement Module (ISM), and the Detail Enhancement Module (DEM). SSAM provides global dependency for local features by establishing spatial correlation between locations in the feature maps. ISM further enriches spatial information and refines local representations through skip connections and dilated convolution. DEM generates high-resolution features and compensates for lost detail information to enable more precise prediction. Experiments on two keypoint benchmarks, MPII and COCO, validate the effectiveness of the model, which outperforms most state-of-the-art methods.
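As an illustration of what "establishing spatial correlation between locations in feature maps" can look like, the following is a minimal NumPy sketch of a generic spatial self-attention layer. It is not the SSAM described in the paper: the function name `spatial_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and the residual scale `gamma` are all assumptions made for this example, and the abstract does not specify the actual module design.

```python
import numpy as np

def spatial_self_attention(x, w_q, w_k, w_v, gamma=1.0):
    """Generic spatial self-attention over a (C, H, W) feature map.

    Hypothetical sketch: w_q and w_k have shape (Cq, C), w_v has shape (C, C).
    These stand in for learned 1x1 convolutions; gamma scales the attended
    features before the residual addition.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)          # (C, N) with N = H * W locations

    q = w_q @ flat                      # (Cq, N) queries
    k = w_k @ flat                      # (Cq, N) keys
    v = w_v @ flat                      # (C, N) values

    # scores[i, j] is the similarity between location i and location j.
    scores = q.T @ k                    # (N, N)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over j, rows sum to 1

    # Each location aggregates values from every location in the map,
    # which is what gives a local feature its global dependency.
    out = v @ attn.T                    # (C, N)

    return gamma * out.reshape(c, h, w) + x, attn
```

In this sketch the attention matrix has one row per spatial location, so its memory cost grows quadratically with the number of locations, which is one reason such layers are usually applied to lower-resolution feature maps.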


Pages: 651-662

Page count: 12

