3D hand pose and shape estimation from monocular RGB via efficient 2D cues

Cited by: 2

Authors:

Zhang, Fenghao [1]

Zhao, Lin [2]

Li, Shengling [1]

Su, Wanjuan [2]

Liu, Liman [1]

Tao, Wenbing [2]

Affiliations:

[1] South Cent Minzu Univ, Sch Biomed Engn, Hubei Key Lab Med Informat Anal & Tumor Diag & Tre, Wuhan 430074, Peoples R China

[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China

Source:

COMPUTATIONAL VISUAL MEDIA | 2024 / Vol. 10 / Issue 01

Funding:

National Natural Science Foundation of China;

Keywords:

hand; 3D reconstruction; deep learning; image features; 3D mesh;

DOI:

10.1007/s41095-023-0346-4

Chinese Library Classification:

TP31 [Computer software];

Discipline classification codes:

081202; 0835;

Abstract:

Estimating 3D hand shape from a single-view RGB image is important for many applications. However, the diversity of hand shapes and postures, depth ambiguity, and occlusion may result in pose errors and noisy hand meshes. Making full use of 2D cues such as 2D pose can effectively improve the quality of 3D human hand shape estimation. In this paper, we use 2D joint heatmaps to obtain spatial details for robust pose estimation. We also introduce a depth-independent 2D mesh to avoid depth ambiguity in mesh regression for efficient hand-image alignment. Our method has four cascaded stages: 2D cue extraction, pose feature encoding, initial reconstruction, and reconstruction refinement. Specifically, we first encode the image to determine semantic features during 2D cue extraction; this is also used to predict hand joints and for segmentation. Then, during the pose feature encoding stage, we use a hand joints encoder to learn spatial information from the joint heatmaps. Next, a coarse 3D hand mesh and 2D mesh are obtained in the initial reconstruction step; a mesh squeeze-and-excitation block is used to fuse different hand features to enhance perception of 3D hand structures. Finally, a global mesh refinement stage learns non-local relations between vertices of the hand mesh from the predicted 2D mesh, to predict an offset hand mesh to fine-tune the reconstruction results. Quantitative and qualitative results on the FreiHAND benchmark dataset demonstrate that our approach achieves state-of-the-art performance.
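The abstract mentions a mesh squeeze-and-excitation block for fusing hand features but does not describe its internals. As a rough illustration only, the following is a minimal sketch of the generic squeeze-and-excitation gating idea (average each feature channel, pass the averages through two small fully connected layers, and rescale each channel by the resulting gate). The function name `squeeze_excite`, the weight matrices `w1` and `w2`, and the layout of `features` (one list of per-vertex values per channel) are assumptions made for this example and are not taken from the paper.

```python
import math


def squeeze_excite(features, w1, w2):
    """Generic squeeze-and-excitation gating (illustrative, not the paper's block).

    features: list of C channels, each a list of N values (assumed: one per mesh vertex).
    w1: R x C weight matrix (hypothetical reduction layer).
    w2: C x R weight matrix (hypothetical expansion layer).
    """
    # Squeeze: summarise each channel by its mean over all positions.
    squeezed = [sum(channel) / len(channel) for channel in features]

    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]

    # Excitation, step 2: fully connected layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]

    # Scale: multiply every value in a channel by that channel's gate.
    return [[g * v for v in channel] for g, channel in zip(gates, features)]


# Two channels, two positions each; all-zero weights give a gate of 0.5 per channel.
example = squeeze_excite([[1.0, 3.0], [2.0, 4.0]], [[0.0, 0.0]], [[0.0], [0.0]])
```

With learned weights, the gates would let the network emphasise some feature channels and suppress others before later stages use them; how the paper adapts this to mesh features is not stated in the abstract.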

Citation

Pages: 79-96

Page count: 18

