Model-based 3D Hand Reconstruction via Self-Supervised Learning

Cited by: 61

Authors:

Chen, Yujin [1,2]

Tu, Zhigang [1]

Kang, Di [2]

Bao, Linchao [2]

Zhang, Ying [3]

Zhe, Xuefei [2]

Chen, Ruizhi [1]

Yuan, Junsong [4]

Affiliations:

[1] Wuhan University, Wuhan, Hubei, People's Republic of China

[2] Tencent AI Lab, Bellevue, WA, USA

[3] Tencent, Shenzhen, Guangdong, People's Republic of China

[4] SUNY Buffalo, Buffalo, NY, USA

Source:

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 | 2021

Funding:

U.S. National Science Foundation

DOI:

10.1109/CVPR46437.2021.01031

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Reconstructing a 3D hand from a single-view RGB image is challenging due to various hand configurations and depth ambiguity. To reliably reconstruct a 3D hand from a monocular image, most state-of-the-art methods heavily rely on 3D annotations at the training stage, but obtaining 3D annotations is expensive. To alleviate reliance on labeled training data, we propose S2HAND, a self-supervised 3D hand reconstruction network that can jointly estimate pose, shape, texture, and the camera viewpoint. Specifically, we obtain geometric cues from the input image through easily accessible 2D detected keypoints. To learn an accurate hand reconstruction model from these noisy geometric cues, we utilize the consistency between 2D and 3D representations and propose a set of novel losses to rationalize outputs of the neural network. For the first time, we demonstrate the feasibility of training an accurate 3D hand reconstruction network without relying on manual annotations. Our experiments show that the proposed self-supervised method achieves comparable performance with recent fully-supervised methods. The code is available at https://github.com/TerenceCYJ/S2HAND.

Pages: 10446-10455

Page count: 10
