ShapeGraFormer: GraFormer-Based Network for Hand-Object Reconstruction From a Single Depth Map

Authors
Aboukhadra, Ahmed Tawfik [1, 2]
Malik, Jameel [1, 3]
Robertini, Nadia [1]
Elhayek, Ahmed [4]
Stricker, Didier [1, 2]
Affiliations
[1] German Res Ctr Artificial Intelligence DFKI, Augmented Vis Grp, D-67663 Kaiserslautern, Germany
[2] Univ Kaiserslautern Landau RPTU, Dept Comp Sci, D-67663 Kaiserslautern, Germany
[3] Natl Univ Sci & Technol NUST, Sch Elect Engn & Comp Sci SEECS, Islamabad 44000, Pakistan
[4] Univ Prince Mugrin UPM, Coll Comp & Cyber Sci, Artificial Intelligence Dept, Madinah 42241, Saudi Arabia
Source
IEEE ACCESS | 2024 / Volume 12
Keywords
Three-dimensional displays; Image reconstruction; Feature extraction; Heating systems; Solid modeling; Computer vision; Convolutional neural networks; Transformers; Human activity recognition; deep learning; graph convolutional network; hand-object 3D reconstruction; pose estimation;
DOI
10.1109/ACCESS.2024.3445993
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
3D reconstruction of hand-object manipulations is important for emulating human actions. Most methods dealing with challenging object manipulation scenarios focus on hand reconstruction in isolation, ignoring physical and kinematic constraints due to object contact. Some approaches produce more realistic results by jointly reconstructing 3D hand-object interactions. However, they focus on coarse pose estimation or rely upon known hand and object shapes. We propose an approach for realistic 3D hand-object shape and pose reconstruction from a single depth map. Unlike previous work, our voxel-based reconstruction network regresses the vertex coordinates of a hand and an object and reconstructs more realistic interaction. Our pipeline additionally predicts voxelized hand-object shapes that have a one-to-one mapping to the input voxelized depth. We then exploit the graph nature of the hand and object shapes by using the recent GraFormer network with positional embedding to reconstruct shapes from template meshes. In addition, we show the impact of adding another GraFormer component, which refines the reconstructed shapes based on the hand-object interactions, and its ability to reconstruct more accurate object shapes. Based on these contributions, we name our system ShapeGraFormer. We perform an extensive evaluation on the HO-3D and DexYCB datasets and show that our method outperforms existing approaches in hand reconstruction and produces plausible reconstructions for the objects.
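The abstract refers to an "input voxelized depth" without giving details. As a rough illustration only, the sketch below shows one common way a depth map could be turned into an occupancy grid: back-project pixels with a pinhole camera model, then bin the points into a cube centred on their centroid. The function names `backproject` and `voxelize`, the intrinsics `fx, fy, cx, cy`, the grid resolution, and the cube size are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def backproject(depth: List[List[float]], fx: float, fy: float,
                cx: float, cy: float) -> List[Point]:
    """Convert a depth map (depths in mm, 0 = no reading) to 3D points.

    Uses the standard pinhole model; the intrinsics are hypothetical inputs.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:  # skip pixels without a valid depth reading
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def voxelize(points: List[Point], grid_size: int = 8,
             cube_mm: float = 200.0) -> Set[Voxel]:
    """Return the occupied cells of a grid_size^3 grid.

    The grid spans a cube of side cube_mm centred on the point centroid
    (an assumed convention); points outside the cube are dropped.
    """
    if not points:
        return set()
    n = len(points)
    centre = tuple(sum(p[i] for p in points) / n for i in range(3))
    cell = cube_mm / grid_size  # edge length of one voxel in mm
    occupied = set()
    for p in points:
        idx = tuple(int((p[i] - centre[i] + cube_mm / 2) // cell)
                    for i in range(3))
        if all(0 <= k < grid_size for k in idx):
            occupied.add(idx)
    return occupied


if __name__ == "__main__":
    # Tiny 2x2 depth map with two valid pixels at 500 mm.
    depth = [[0, 500], [500, 0]]
    pts = backproject(depth, fx=500.0, fy=500.0, cx=0.5, cy=0.5)
    print(sorted(voxelize(pts)))
```

Returning a set of occupied indices keeps the sketch dependency-free; a practical pipeline would more likely store the grid as a dense array so that a 3D convolutional network can consume it directly.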
Pages: 124021-124031
Page count: 11