High-Resolution Multi-Scale Feature Fusion Network for Running Posture Estimation

Cited by: 3
Authors
Xu, Xiaobing [1]
Zhang, Yaping [1]
Affiliations
[1] Yunnan Normal Univ, Sch Informat Sci & Technol, Kunming 650500, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024 / Volume 14 / Issue 07
Keywords
human pose estimation; joint occlusion; multi-branch network; multi-scale features; running posture estimation; DEEP CONVOLUTIONAL NETWORKS;
DOI
10.3390/app14073065
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Running posture estimation is a specialized task in human pose estimation that has received relatively little research attention due to the lack of appropriate datasets. To address this issue, this paper constructs a new benchmark dataset, "Running Human", designed specifically for running sports. The dataset contains over 1000 images with comprehensive annotations for 1288 instances of running humans, including bounding boxes and body keypoints. Additionally, a Receptive Field Spatial Pooling (RFSP) module was developed to tackle joint occlusion, which is common in images of running sports. Incorporating this module into the High-Resolution Network (HRNet) yields a novel network model named the Running Human Posture Network (RHPNet). By expanding the receptive field and effectively utilizing the multi-scale features extracted by the multi-branch network, RHPNet significantly enhances the accuracy of running posture estimation. On the Running Human dataset, the proposed method achieved state-of-the-art performance. Experiments were also conducted on two benchmark datasets. On the COCO dataset, RHPNet demonstrated prediction accuracy comparable to that of the state-of-the-art ViTPose-L method while using only one tenth of the parameters and one eighth of the floating-point operations (FLOPs). On the MPII dataset, RHPNet achieved a PCKh@0.5 score of 92.0, only 0.5 points lower than the state-of-the-art method, PCT. These experimental results support the effectiveness and generalization ability of the proposed method.
Pages: 17
Related papers
36 in total
[1] Alejandro N., 2016, P COMPUTER VISION EC
[2] 2D Human Pose Estimation: New Benchmark and State of the Art Analysis [J]. Andriluka, Mykhaylo; Pishchulin, Leonid; Gehler, Peter; Schiele, Bernt. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014: 3686-3693
[3] Artacho B, 2021, Arxiv, DOI arXiv:2103.10180
[4] Bin Y., 2020, P COMPUTER VISION EC
[5] Cai Y., 2020, P COMPUTER VISION EC
[6] Chen PG, 2024, Arxiv, DOI arXiv:2001.04086
[7] Cheng B., 2020, IEEE C COMP VIS PATT, DOI 10.48550/ARXIV.1908.10357
[8] Human Pose as Compositional Tokens [J]. Geng, Zigang; Wang, Chunyu; Wei, Yixuan; Liu, Ze; Li, Houqiang; Hu, Han. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023: 660-671
[9] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition [J]. He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2015, 37 (09): 1904-1916
[10] He KM, 2014, LECT NOTES COMPUT SC, V8691, P346, DOI [arXiv:1406.4729, 10.1007/978-3-319-10578-9_23]