Beyond 3DMM: Learning to Capture High-Fidelity 3D Face Shape

Cited by: 4
Authors
Zhu, Xiangyu [1, 2, 3]
Yu, Chang [1, 2, 3]
Huang, Di [4]
Lei, Zhen [1, 2, 3, 5]
Wang, Hao [1, 2, 3]
Li, Stan Z. [6]
Affiliations
[1] Chinese Acad Sci, Ctr Biometr & Secur Res, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[4] Beihang Univ, Key Lab Software Dev Environm, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
[5] Chinese Acad Sci, Ctr Artificial Intelligence & Robot, Hong Kong Inst Sci & Innovat, Hong Kong, Peoples R China
[6] Westlake Univ, Sch Engn, Hangzhou 310024, Zhejiang, Peoples R China
Keywords
3D face; face reconstruction; 3DMM; fine-grained; personalized; 3D face dataset; single image; reconstruction
DOI
10.1109/TPAMI.2022.3164131
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
3D Morphable Model (3DMM) fitting has widely benefited face analysis due to its strong 3D prior. However, previously reconstructed 3D faces suffer from degraded visual verisimilitude due to the loss of fine-grained geometry, which is attributed to insufficient ground-truth 3D shapes, unreliable training strategies, and the limited representation power of 3DMM. To alleviate this issue, this paper proposes a complete solution to capture the personalized shape so that the reconstructed shape looks identical to the corresponding person. Specifically, given a 2D image as the input, we virtually render the image in several calibrated views to normalize pose variations while preserving the original image geometry. A many-to-one hourglass network serves as the encoder-decoder to fuse multiview features and generate vertex displacements as the fine-grained geometry. In addition, the neural network is trained by directly optimizing the visual effect, where two 3D shapes are compared by measuring the similarity between the multiview images rendered from them. Finally, we propose to generate the ground-truth 3D shapes by registering RGB-D images followed by pose and shape augmentation, providing sufficient data for network training. Experiments on several challenging protocols demonstrate the superior reconstruction accuracy of our proposal on the face shape.
Pages: 1442-1457
Page count: 16