ShapeGraFormer: GraFormer-Based Network for Hand-Object Reconstruction From a Single Depth Map

Cited by: 0

Authors:

Aboukhadra, Ahmed Tawfik ^{[1,2]}

Malik, Jameel ^{[1,3]}

Robertini, Nadia ^{[1]}

Elhayek, Ahmed ^{[4]}

Stricker, Didier ^{[1,2]}

Affiliations:

[1] German Res Ctr Artificial Intelligence DFKI, Augmented Vis Grp, D-67663 Kaiserslautern, Germany

[2] Univ Kaiserslautern Landau RPTU, Dept Comp Sci, D-67663 Kaiserslautern, Germany

[3] Natl Univ Sci & Technol NUST, Sch Elect Engn & Comp Sci SEECS, Islamabad 44000, Pakistan

[4] Univ Prince Mugrin UPM, Coll Comp & Cyber Sci, Artificial Intelligence Dept, Madinah 42241, Saudi Arabia

Source:

IEEE ACCESS | 2024 / Volume 12

Keywords:

Three-dimensional displays; Image reconstruction; Feature extraction; Heating systems; Solid modeling; Computer vision; Convolutional neural networks; Transformers; Human activity recognition; deep learning; graph convolutional network; hand-object 3D reconstruction; pose estimation;

DOI:

10.1109/ACCESS.2024.3445993

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

3D reconstruction of hand-object manipulations is important for emulating human actions. Most methods dealing with challenging object manipulation scenarios focus on hand reconstruction in isolation, ignoring physical and kinematic constraints due to object contact. Some approaches produce more realistic results by jointly reconstructing 3D hand-object interactions. However, they focus on coarse pose estimation or rely upon known hand and object shapes. We propose an approach for realistic 3D hand-object shape and pose reconstruction from a single depth map. Unlike previous work, our voxel-based reconstruction network regresses the vertex coordinates of a hand and an object and reconstructs more realistic interactions. Our pipeline additionally predicts voxelized hand-object shapes, having a one-to-one mapping to the input voxelized depth. Thereafter, we exploit the graph nature of the hand and object shapes by utilizing the recent GraFormer network with positional embedding to reconstruct shapes from template meshes. In addition, we show the impact of adding another GraFormer component that refines the reconstructed shapes based on the hand-object interactions and its ability to reconstruct more accurate object shapes. Based on these contributions, we name our system ShapeGraFormer. We perform an extensive evaluation on the HO-3D and DexYCB datasets and show that our method outperforms existing approaches in hand reconstruction and produces plausible reconstructions for the objects.

Citation

Pages: 124021-124031

Page count: 11

