Lite CSW-HRNet: Lightweight High-Resolution Human Pose Estimation Based on Channel Spatial Weighting

Cited by: 0
Authors
Xi, Yang [1]
Zhang, Zi-Hao [1]
Meng, Si-Yu [2]
Fu, Jia [2]
Wu, Zhen-Yu [3]
Wang, Wen-Jing [1]
Affiliations
[1] School of Computer Science, Northeast Electric Power University, Jilin, 132012, China
[2] Yongji Power Supply Company, State Grid Jilin Electric Power Co., Ltd, Jilin, 132012, China
[3] Department of Orthopedics of Affiliated Hospital of Beihua University, Beihua University, Jilin, 132012, China
Source
Journal of Network Intelligence | 2024 / Vol. 9 / No. 3
Keywords
Channel space weighting - Computational costs - Deep learning - High resolution - High-resolution network - Human pose estimations - ITS applications - Lightweight network - Video surveillance - Weight calculation
DOI
Not available
Abstract
Human pose estimation is widely used in virtual reality, medical diagnosis, video surveillance, and other fields. However, its high computational cost restricts deployment on some terminal devices, so research on lightweight models is particularly important. To address the inadequate learning of weight coefficients in Lite-HRNet, this paper proposes Lite CSW-HRNet, which computes channel and spatial weight maps independently and in parallel during single-resolution and cross-resolution weight computation, and fully preserves the original features by using both max pooling and average pooling in the computation. In the channel weight calculation, an adaptive 1D convolution aggregates information between adjacent channels, avoiding the adverse effects of channel degradation. In the spatial weight calculation, a 7×7 convolution enlarges the receptive field to aggregate a wider range of spatial contextual information. Comparative experiments on the COCO2017 dataset show that, compared with Lite-HRNet, Lite CSW-HRNet further improves accuracy while reducing both Params and FLOPs, and that it outperforms the state-of-the-art MobileNet and ShuffleNet in all metrics. Compared with the large model HRNet, its Params and FLOPs are about 1/30 of HRNet's while its AP reaches 90% of HRNet's, achieving a better balance between accuracy and complexity in human pose estimation. © 2024, Taiwan Ubiquitous Information CO LTD. All rights reserved.
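The channel and spatial weighting described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the ECA-Net-style rule for choosing the 1D kernel size, the sharing of one kernel between the average-pooled and max-pooled descriptors, and the final fusion by multiplying both weight maps into the features are all assumptions made for illustration, and the kernels are passed in rather than learned.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def adaptive_kernel_size(channels, gamma=2, b=1):
    # Assumed ECA-Net-style rule: odd kernel size derived from log2(C).
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1

def conv1d_same(vec, kernel):
    # Zero-padded 1D cross-correlation; output length equals len(vec).
    pad = len(kernel) // 2
    out = []
    for i in range(len(vec)):
        s = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - pad
            if 0 <= idx < len(vec):
                s += w * vec[idx]
        out.append(s)
    return out

def channel_weights(x, kernel):
    # x is a C x H x W nested list. Each channel is pooled to an average
    # and a max value, then adjacent channels are mixed by the 1D kernel.
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    mx = [max(map(max, ch)) for ch in x]
    a, m = conv1d_same(avg, kernel), conv1d_same(mx, kernel)
    return [sigmoid(p + q) for p, q in zip(a, m)]

def spatial_weights(x, kernel):
    # kernel is 2 x K x K (K = 7 in the abstract): one slice for the
    # channel-wise average map and one for the channel-wise max map.
    C, H, W = len(x), len(x[0]), len(x[0][0])
    avg = [[sum(x[c][i][j] for c in range(C)) / C for j in range(W)]
           for i in range(H)]
    mx = [[max(x[c][i][j] for c in range(C)) for j in range(W)]
          for i in range(H)]
    maps = [avg, mx]
    ksz = len(kernel[0])
    pad = ksz // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for m in range(2):
                for di in range(ksz):
                    for dj in range(ksz):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < H and 0 <= jj < W:
                            s += kernel[m][di][dj] * maps[m][ii][jj]
            out[i][j] = sigmoid(s)
    return out

def channel_spatial_weighting(x, k1d, k2d):
    # Both weight maps are computed independently from the same input
    # (parallel branches); multiplying them in is an assumed fusion rule.
    cw = channel_weights(x, k1d)
    sw = spatial_weights(x, k2d)
    return [[[x[c][i][j] * cw[c] * sw[i][j] for j in range(len(x[0][0]))]
             for i in range(len(x[0]))] for c in range(len(x))]
```

In a real network the two kernels would be learned convolution weights and the operation would run on batched tensors; the nested-list form here only makes the data flow of the two branches explicit.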
Pages: 1641-1656