Enhanced Human Pose Estimation with Attention-Augmented HRNet

Cited by: 1
Authors
Zhang, Junjie [1]
Yang, Haojie [2]
Deng, Yancong [3]
Affiliations
[1] Beijing Normal Univ, Hong Kong Baptist Univ, United Int Coll, Zhu Hai, Guangdong, Peoples R China
[2] Shanghai Jiao Tong Univ, Sch Math Sci, Shanghai, Peoples R China
[3] Univ Calif San Diego, Jacob Sch Engn, La Jolla, CA USA
Source
6TH INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND MACHINE VISION, IPMV 2024 | 2024
Keywords
Human Pose Estimation; HRNet; Attention Mechanism;
DOI
10.1145/3645259.3645274
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human pose estimation is a pivotal task in computer vision that aims to accurately predict the spatial locations of key body joints within an image. The challenge arises from the need to handle complex poses, occlusions, and variations in body configuration, which often confound traditional pose estimation models. To improve the accuracy and robustness of human pose estimation, we introduce an Attention-Augmented HRNet architecture. The proposed model augments the original HRNet with self-attention mechanisms, which capture long-range dependencies among keypoints and focus more effectively on pivotal body regions. Experimental results demonstrate that the Attention-Augmented HRNet surpasses the baseline HRNet without attention and attains state-of-the-art performance on the COCO dataset. Specifically, our model achieves an Average Precision (AP) of 74.5%.
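The abstract does not say how self-attention is wired into HRNet, so the following is only a rough illustration of the general idea: every spatial position of a feature map attends to every other position, which is how long-range dependencies between distant body parts can be captured. It is a minimal sketch, assuming a single head, identity query/key/value projections (a real model would learn these), and a residual connection. The function names `softmax` and `self_attention` are hypothetical and do not come from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention over flattened feature-map positions.

    features: list of N position vectors, each with C channels
              (an H x W x C feature map flattened to N = H * W rows).
    Returns N vectors of length C.

    Assumption: queries, keys, and values are the input features themselves
    (identity projections); learned projections are omitted for brevity.
    """
    channels = len(features[0])
    scale = math.sqrt(channels)
    output = []
    for query in features:
        # Similarity of this position to every position, including itself.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in features
        ]
        weights = softmax(scores)
        # Weighted sum of all positions: the globally aggregated context.
        context = [
            sum(w * value[c] for w, value in zip(weights, features))
            for c in range(channels)
        ]
        # Residual connection keeps the original local feature.
        output.append([x + ctx for x, ctx in zip(query, context)])
    return output


if __name__ == "__main__":
    # A toy 3-position "feature map" with 2 channels.
    toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in self_attention(toy):
        print(row)
```

Because the attention weights for each position form a convex combination over all positions, the context vector can draw on features from anywhere in the image, not only from a local convolutional neighborhood.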
Pages: 88-93
Number of pages: 6