Lightweight and efficient human pose estimation with enhanced priori skeleton structure

Authors
Sun X. [1]
Zhang R. [1]
Guan X. [1]
Li Q. [1]
Affiliation
[1] School of Microelectronics, Tianjin University, Tianjin
Source
Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science) | 2024, Vol. 58, No. 1
Keywords
convolution direction enhancement; deep learning; human pose estimation; keypoints detection; postural enhancement
DOI
10.3785/j.issn.1008-973X.2024.01.006
Abstract
A lightweight and efficient human pose estimation method with an enhanced prior skeleton structure was proposed to make better use of the distinctive distribution of human pose keypoints. A high-resolution network was used to better preserve spatial location information. A lightweight inverted residual module was employed to reduce the number of model parameters. A postural enhancement module was designed to strengthen the prior information of human pose and the connections between keypoints by using global spatial feature information and context information. A direction-enhanced convolution module was proposed to address the loss of keypoint spatial feature information that occurs when multi-resolution feature maps are fused, which is caused by blurred pixel positions and directional shifts in convolution kernel optimization. The module incorporates the prior distribution of keypoints by exploiting the horizontal and vertical arrangement of the keypoints on the torso. The experimental results demonstrate that the network can efficiently estimate human pose. The model achieves an average precision score of 78.4 on the COCO test-dev set and reduces the number of parameters by 17.4×10^6 compared with the benchmark network, balancing accuracy and efficiency. © 2024 Zhejiang University. All rights reserved.
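The abstract attributes part of the parameter reduction to an inverted residual module. As a rough illustration of why such a block is lighter, the sketch below counts convolution weights for a basic residual block (two 3×3 convolutions) and for a MobileNetV2-style inverted residual block (1×1 expand, 3×3 depthwise, 1×1 project). The function names, the expansion factor of 6, and the channel widths are assumptions chosen for illustration; the record does not give the paper's actual block configuration, and normalization layers and biases are ignored.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a 2-D convolution without bias."""
    return (c_in // groups) * c_out * k * k


def basic_block_params(channels, k=3):
    """Two stacked k x k convolutions, as in a basic residual block."""
    return 2 * conv_params(channels, channels, k)


def inverted_residual_params(channels, expansion=6, k=3):
    """MobileNetV2-style block: 1x1 expand, k x k depthwise, 1x1 project.

    The expansion factor is an assumed value, not taken from the paper.
    """
    hidden = channels * expansion
    expand = conv_params(channels, hidden, 1)
    depthwise = conv_params(hidden, hidden, k, groups=hidden)
    project = conv_params(hidden, channels, 1)
    return expand + depthwise + project


if __name__ == "__main__":
    # Hypothetical channel widths, for comparison only.
    for channels in (32, 64, 128):
        basic = basic_block_params(channels)
        inverted = inverted_residual_params(channels)
        print(f"channels={channels}: basic={basic}, inverted={inverted}, "
              f"ratio={inverted / basic:.2f}")
```

The depthwise step contributes only `hidden * k * k` weights, so most of the remaining cost sits in the two 1×1 convolutions; a smaller expansion factor would shrink the block further.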
Pages: 50-60
Page count: 10
Related papers
26 in total
  • [1] REIS E S, SEEWALD L A, ANTUNES R S, et al., Monocular multi-person pose estimation: a survey, Pattern Recognition, 118, (2021)
  • [2] NEWELL A, YANG K, DENG J., Stacked hourglass networks for human pose estimation [C], European Conference on Computer Vision, pp. 483-499, (2016)
  • [3] CHEN Y, WANG Z, PENG Y, et al., Cascaded pyramid network for multi-person pose estimation [C], IEEE Conference on Computer Vision and Pattern Recognition, pp. 7103-7112, (2018)
  • [4] XIAO B, WU H, WEI Y., Simple baselines for human pose estimation and tracking [C], European Conference on Computer Vision, pp. 472-487, (2018)
  • [5] SUN K, XIAO B, LIU D, et al., Deep high-resolution representation learning for human pose estimation [C], IEEE Conference on Computer Vision and Pattern Recognition, pp. 5686-5696, (2019)
  • [6] SANDLER M, HOWARD A, ZHU M, et al., MobileNetV2: inverted residuals and linear bottlenecks [C], IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510-4520, (2018)
  • [7] ZHANG X, ZHOU X, LIN M, et al., ShuffleNet: an extremely efficient convolutional neural network for mobile devices [C], IEEE Conference on Computer Vision and Pattern Recognition, pp. 6848-6856, (2018)
  • [8] QIAO S, CHEN L C, YUILLE A., DetectoRS: detecting objects with recursive feature pyramid and switchable atrous convolution [C], IEEE Conference on Computer Vision and Pattern Recognition, pp. 10208-10219, (2021)
  • [9] LIN T Y, DOLLAR P, GIRSHICK R, et al., Feature pyramid networks for object detection [J], IEEE Computer Society, 1, pp. 936-944, (2017)
  • [10] SU H, JAMPANI V, SUN D, et al., Pixel-adaptive convolutional neural networks [C], IEEE Conference on Computer Vision and Pattern Recognition, pp. 11158-11167, (2019)