An optimization high-resolution network for human pose recognition based on attention mechanism

Cited by: 0

Authors:

Jinlong Yang

Yu Feng

Affiliations:

[1] Jiangnan University, School of Artificial Intelligence and Computer Science

Source:

Multimedia Tools and Applications | 2024 / Vol. 83

Keywords:

Human pose estimation; Deep neural network; High resolution network (HRNet); Dilated convolution (DC); Attention mechanism;

DOI:

Not available

Abstract:

In the high-resolution network (HRNet), the lower layers of the low-resolution branches can adopt a shallow parallel network structure to maintain high-resolution features and highlight global features. However, high-resolution human pose estimation networks suffer from a large number of network parameters, high computational complexity, and low recognition precision for similar actions. To address these problems, we propose an optimized HRNet based on an attention mechanism. First, a dilated convolution (DC) module is introduced into cross-channel sampling to obtain global features by enlarging the receptive field of the feature map, which ensures that the feature map can cover all the information in the original image. Second, the Squeeze-and-Excitation (SE) channel attention module is introduced into cross-channel feature fusion to learn the correlations between channels; it recalibrates the features, selectively emphasizing informative features and suppressing secondary ones, which improves recognition precision without changing the parameter count or computational complexity. Finally, experimental results on the KTH dataset show that the HRNet with channel attention and dilated convolution achieves better accuracy.
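The two components named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the reduction ratio of 4, the tensor sizes, and the random weights are all assumptions chosen for illustration. The first function gives the spatial extent covered by a single dilated kernel, and the second applies a generic Squeeze-and-Excitation recalibration to one feature map.

```python
import numpy as np


def dilated_kernel_extent(kernel_size: int, dilation: int) -> int:
    """Spatial extent (in pixels) spanned by one dilated convolution kernel."""
    return dilation * (kernel_size - 1) + 1


def se_recalibrate(features, w_reduce, w_expand):
    """Generic Squeeze-and-Excitation recalibration of a (C, H, W) feature map.

    w_reduce has shape (C // r, C) and w_expand has shape (C, C // r);
    both are hypothetical weights standing in for learned parameters.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))
    # Excitation: bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(w_reduce @ squeezed, 0.0)
    gates = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))
    # Scale: reweight every channel by its gate value in (0, 1).
    return features * gates[:, None, None], gates


# Illustrative sizes only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
channels, reduction = 8, 4
features = rng.standard_normal((channels, 16, 16))
w_reduce = 0.1 * rng.standard_normal((channels // reduction, channels))
w_expand = 0.1 * rng.standard_normal((channels, channels // reduction))

recalibrated, gates = se_recalibrate(features, w_reduce, w_expand)
extent_plain = dilated_kernel_extent(3, 1)    # ordinary 3x3 kernel
extent_dilated = dilated_kernel_extent(3, 2)  # 3x3 kernel with dilation 2
```

With dilation 2, a 3x3 kernel spans a 5x5 window while still using nine weights, which is how dilation enlarges the receptive field. The SE gates leave the feature map shape unchanged and only rescale each channel.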

Pages: 45535-45552

Number of pages: 17
