Monocular 3D hand pose estimation based on high-resolution network

Citations: 0
Authors
Li, Shengling [1]
Su, Wanjuan [2]
Luo, Guansheng [1]
Tian, Jinshan [1]
Han, Yifei [1]
Liu, Liman [1]
Tao, Wenbing [2]
Affiliations
[1] South Cent Minzu Univ, State Ethn Comm, Sch Biomed Engn, Hubei Prov Key Lab Med Informat Anal & Tumor Diag, Wuhan 430074, Peoples R China
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automation, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China
Source
ADVANCES IN CONTINUOUS AND DISCRETE MODELS | 2025 / Vol. 2025 / Issue 01
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; Hand pose estimation; Hand mesh; Image features; RGB; TRACKING;
DOI
10.1186/s13662-025-03948-2
Chinese Library Classification (CLC) number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
Hand pose estimation from monocular images has been widely applied in many fields. However, the depth ambiguity of the hand, together with occlusion or lighting changes caused by hand-object interaction, makes accurate pose estimation challenging. Many current methods deepen their models to improve the accuracy of hand pose estimation networks, which significantly increases the number of model parameters and the computational complexity. To address this issue, we propose an improved High-Resolution Network (HRDNet) for hand pose estimation. Our method consists of four main stages: image feature extraction, 2D information prediction, 3D joint prediction, and hand mesh reconstruction. For image feature extraction, we propose a hand feature perception module and a lightweight basic module to enhance the network's multiscale representation capabilities. For 2D information prediction, we propose a hand pose estimation network that fully utilizes the 2D information of the hand while predicting 2D joint heat maps, together with a hand joint pose encoder that improves the prediction accuracy of key points in the heat maps. For 3D joint prediction, a two-level cascaded pose estimation network predicts the 3D joint coordinates. Finally, we use the Inverse Kinematics Network (IKNet) to regress the pose and shape parameters of the hand Model with Articulated and Non-rigid defOrmations (MANO), from which the hand mesh is reconstructed. Extensive experiments on two publicly available datasets show that our method achieves competitive results compared with state-of-the-art hand pose estimation methods.
Pages: 24