Model-based 3D Hand Reconstruction via Self-Supervised Learning

Cited by: 52
Authors
Chen, Yujin [1, 2]
Tu, Zhigang [1]
Kang, Di [2]
Bao, Linchao [2]
Zhang, Ying [3]
Zhe, Xuefei [2]
Chen, Ruizhi [1]
Yuan, Junsong [4]
Affiliations
[1] Wuhan Univ, Wuhan, Hubei, Peoples R China
[2] Tencent AI Lab, Bellevue, WA USA
[3] Tencent, Shenzhen, Guangdong, Peoples R China
[4] SUNY Buffalo, Buffalo, NY USA
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
Funding
National Science Foundation (United States)
Keywords
DOI
10.1109/CVPR46437.2021.01031
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reconstructing a 3D hand from a single-view RGB image is challenging due to various hand configurations and depth ambiguity. To reliably reconstruct a 3D hand from a monocular image, most state-of-the-art methods heavily rely on 3D annotations at the training stage, but obtaining 3D annotations is expensive. To alleviate reliance on labeled training data, we propose S2HAND, a self-supervised 3D hand reconstruction network that can jointly estimate pose, shape, texture, and the camera viewpoint. Specifically, we obtain geometric cues from the input image through easily accessible 2D detected keypoints. To learn an accurate hand reconstruction model from these noisy geometric cues, we utilize the consistency between 2D and 3D representations and propose a set of novel losses to rationalize outputs of the neural network. For the first time, we demonstrate the feasibility of training an accurate 3D hand reconstruction network without relying on manual annotations. Our experiments show that the proposed self-supervised method achieves comparable performance with recent fully-supervised methods. The code is available at https://github.com/TerenceCYJ/S2HAND.
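The abstract's idea of using consistency between 2D and 3D representations can be illustrated with a small reprojection loss. This is a minimal sketch under stated assumptions, not the losses defined in the paper: the weak-perspective camera, the 21-joint layout, and the names project_weak_perspective and keypoint_consistency_loss are all hypothetical and chosen only for illustration.

```python
import numpy as np


def project_weak_perspective(joints_3d, scale, trans_2d):
    """Project (J, 3) joints to (J, 2) image points.

    Assumption: a weak-perspective camera that drops depth, then
    applies a single scale and a 2D translation.
    """
    return scale * joints_3d[:, :2] + trans_2d


def keypoint_consistency_loss(joints_3d, keypoints_2d, confidence, scale, trans_2d):
    """Confidence-weighted mean squared distance between projected 3D
    joints and detected 2D keypoints (an illustrative stand-in loss)."""
    projected = project_weak_perspective(joints_3d, scale, trans_2d)
    sq_dist = np.sum((projected - keypoints_2d) ** 2, axis=1)
    # Low-confidence detections contribute less to the loss.
    return float(np.sum(confidence * sq_dist) / (np.sum(confidence) + 1e-8))


# Toy data: 21 joints is an assumed hand-skeleton size.
rng = np.random.default_rng(0)
joints_3d = rng.normal(size=(21, 3))
scale, trans_2d = 100.0, np.array([112.0, 112.0])
confidence = np.ones(21)

# "Detected" keypoints that agree exactly with the projection.
keypoints_2d = project_weak_perspective(joints_3d, scale, trans_2d)
loss_exact = keypoint_consistency_loss(
    joints_3d, keypoints_2d, confidence, scale, trans_2d
)

# The same keypoints shifted by 2 pixels in x and y.
loss_shifted = keypoint_consistency_loss(
    joints_3d, keypoints_2d + 2.0, confidence, scale, trans_2d
)
```

In this toy setup the loss is zero when the projected joints match the detections and grows as they drift apart. Weighting by detector confidence is one common way to limit the influence of noisy keypoints.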
Pages: 10446-10455
Page count: 10