LGCANet: lightweight hand pose estimation network based on HRNet

Times Cited: 0
Authors
Pan, Xiaoying [1 ,2 ]
Li, Shoukun [1 ,2 ]
Wang, Hao [3 ,4 ]
Wang, Beibei [1 ,2 ]
Wang, Haoyi [5 ]
Affiliations
[1] Xian Univ Posts & Telecommun, Sch Comp Sci & Technol, Xian 710121, Shaanxi, Peoples R China
[2] Xian Univ Posts & Telecommun, Shaanxi Key Lab Network Data Anal & Intelligent Pr, Xian 710121, Shaanxi, Peoples R China
[3] Northwestern Polytech Univ, Sch Software, Xian 710072, Shaanxi, Peoples R China
[4] Northwestern Polytech Univ, Natl Engn Lab Air Earth Sea Integrat Big Data Appl, Xian 710121, Shaanxi, Peoples R China
[5] Southwest Univ, Westa Coll, Chongqing, Peoples R China
Keywords
Hand pose estimation; High-resolution network; Multi-scale feature fusion; Lightweight network;
DOI
10.1007/s11227-024-06226-2
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Hand pose estimation is a fundamental task in computer vision with applications in virtual reality, gesture recognition, autonomous driving, and virtual surgery. Keypoint detection often relies on deep learning methods and high-resolution feature map representations to achieve accurate detection. LGCANet builds on the HRNet framework, whose high-resolution representations incur a large parameter count and high computational cost. To mitigate these challenges, we propose a lightweight keypoint detection network called LGCANet (Lightweight Ghost-Coordinate Attention Network). The network consists primarily of a lightweight feature extraction head for initial feature extraction and multiple lightweight foundational network modules called GCAblocks. GCAblocks use cheap linear transformations to generate redundant feature maps while capturing inter-channel relationships and long-range positional information through a coordinate attention mechanism. Validation on the RHD and COCO-WholeBody-Hand datasets shows that LGCANet reduces parameters by 65.9% and GFLOPs by 72.6% while preserving accuracy and improving detection speed.
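The two ingredients of a GCAblock named in the abstract can be illustrated with a simplified NumPy sketch. This is not the authors' implementation: the real block uses learned convolutions, normalization, and downsampling, whereas here the "cheap" transform is a placeholder scaling and the attention branch omits the shared 1x1 convolution. The function names and shapes are illustrative assumptions only.

```python
import numpy as np

def ghost_features(x, w_primary):
    """Ghost-module idea (Han et al., CVPR 2020): compute half the output
    channels with an ordinary projection, then derive the other half with
    a cheap per-channel linear transform instead of a full convolution."""
    # x: (C_in, H, W); w_primary: (C_out//2, C_in) acts as a 1x1 conv
    primary = np.einsum('oc,chw->ohw', w_primary, x)  # intrinsic feature maps
    cheap = 0.5 * primary  # placeholder for the cheap depthwise transform
    return np.concatenate([primary, cheap], axis=0)   # (C_out, H, W)

def coordinate_attention(x):
    """Coordinate-attention idea (Hou et al., CVPR 2021): pool along each
    spatial axis separately so the channel weights retain long-range
    positional information along height and width."""
    # x: (C, H, W)
    h_pool = x.mean(axis=2)  # (C, H): average over width
    w_pool = x.mean(axis=1)  # (C, W): average over height
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    a_h = sigmoid(h_pool)[:, :, None]  # (C, H, 1) height-wise weights
    a_w = sigmoid(w_pool)[:, None, :]  # (C, 1, W) width-wise weights
    return x * a_h * a_w               # reweight features by position

# Toy forward pass through the two stages of a hypothetical GCAblock
rng = np.random.default_rng(0)
feats = ghost_features(rng.standard_normal((4, 8, 8)),
                       rng.standard_normal((3, 4)))
out = coordinate_attention(feats)
```

The point of the sketch is the cost structure: only the primary projection touches all input channels, while the cheap branch and the attention pooling are linear in the feature size, which is how the paper's parameter and GFLOP reductions arise.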
Pages: 19351-19373 (23 pages)