Human pose estimation is widely used in virtual reality, medical diagnosis, video surveillance, and other fields. However, its high computational cost restricts its application on some terminal devices, so research on lightweight models is particularly important. To address the inadequate learning of weight coefficients in Lite-HRNet, this paper proposes Lite CSW-HRNet, which computes weight maps independently and in parallel along the channel and spatial dimensions during single-resolution and cross-resolution weight computation, and uses both max pooling and average pooling in the computation so that the original features are fully preserved. In the channel weight calculation, an adaptive 1D convolution is introduced to aggregate information between adjacent channels, avoiding the adverse effects of channel degradation. In the spatial weight calculation, a 7×7 convolution is used to enlarge the receptive field and aggregate a wider range of spatial contextual information. Comparative experiments on the COCO2017 dataset show that, compared with Lite-HRNet, Lite CSW-HRNet further improves accuracy while reducing both Params and FLOPs, and that it outperforms the state-of-the-art MobileNet and ShuffleNet in all metrics. Compared with the large model HRNet, its Params and FLOPs are about 1/30 of HRNet's while its AP can reach 90% of HRNet's, achieving a better balance between accuracy and complexity in human pose estimation. © 2024, Taiwan Ubiquitous Information CO LTD. All rights reserved.
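The channel and spatial weight computations described above can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation and rests on several assumptions: the kernel-size rule in `adaptive_kernel_size` is borrowed from ECA-Net, the average-pooled and max-pooled branches share one kernel and are summed before the sigmoid, and uniform kernels stand in for learned weights. All function names and the `gamma` and `b` parameters are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_kernel_size(channels, gamma=2, b=1):
    """Odd 1D kernel size that grows with log2(channels).

    Assumption: this is the ECA-Net rule, used here only for illustration.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def conv1d_same(values, kernel):
    """1D cross-correlation with zero padding; expects an odd-length kernel."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(values) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(values))
    ]


def channel_weights(feature, kernel=None):
    """feature: C channels, each an H x W list of lists. Returns C weights."""
    if kernel is None:
        k = adaptive_kernel_size(len(feature))
        kernel = [1.0 / k] * k  # placeholder for learned weights
    # Global average and max pooling over the spatial dimensions.
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature]
    mx = [max(map(max, ch)) for ch in feature]
    # The 1D convolution mixes each channel with its neighbours.
    mixed_avg = conv1d_same(avg, kernel)
    mixed_max = conv1d_same(mx, kernel)
    return [sigmoid(a + m) for a, m in zip(mixed_avg, mixed_max)]


def spatial_weights(feature, kernel_size=7, kernel=None):
    """Returns an H x W weight map from a 7x7 convolution over pooled maps."""
    c, h, w = len(feature), len(feature[0]), len(feature[0][0])
    # Average and max pooling across channels give two H x W planes.
    avg = [[sum(feature[n][i][j] for n in range(c)) / c for j in range(w)]
           for i in range(h)]
    mx = [[max(feature[n][i][j] for n in range(c)) for j in range(w)]
          for i in range(h)]
    if kernel is None:
        v = 1.0 / (2 * kernel_size * kernel_size)  # placeholder weights
        kernel = [[[v] * kernel_size for _ in range(kernel_size)]
                  for _ in range(2)]
    pad = kernel_size // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            s = 0.0
            for p, plane in enumerate((avg, mx)):
                for di in range(kernel_size):
                    for dj in range(kernel_size):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < h and 0 <= x < w:  # zero padding
                            s += kernel[p][di][dj] * plane[y][x]
            row.append(sigmoid(s))
        out.append(row)
    return out
```

Both functions take the same input feature map and do not depend on each other, which mirrors the parallel, independent computation the abstract describes. How the two weight maps are then applied to the features is not stated in the abstract and is left out of the sketch.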