Learning an Animatable Detailed 3D Face Model from In-The-Wild Images

Cited by: 354
Authors
Feng, Yao [1, 2]
Feng, Haiwen [1]
Black, Michael J. [1]
Bolkart, Timo [1]
Affiliations
[1] Max Planck Institute for Intelligent Systems, Tübingen, Germany
[2] Max Planck ETH Center for Learning Systems, Tübingen, Germany
Source
ACM Transactions on Graphics | 2021 / Vol. 40 / Issue 04
Keywords
Detailed face model; 3D face reconstruction; facial animation; detail disentanglement; MORPHABLE MODEL; SHAPE;
DOI
10.1145/3450626.3459936
Chinese Library Classification (CLC) number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
While current monocular 3D face reconstruction methods can recover fine geometric details, they suffer several limitations. Some methods produce faces that cannot be realistically animated because they do not model how wrinkles vary with expression. Other methods are trained on high-quality face scans and do not generalize well to in-the-wild images. We present the first approach that regresses 3D face shape and animatable details that are specific to an individual but change with expression. Our model, DECA (Detailed Expression Capture and Animation), is trained to robustly produce a UV displacement map from a low-dimensional latent representation that consists of person-specific detail parameters and generic expression parameters, while a regressor is trained to predict detail, shape, albedo, expression, pose and illumination parameters from a single image. To enable this, we introduce a novel detail-consistency loss that disentangles person-specific details from expression-dependent wrinkles. This disentanglement allows us to synthesize realistic person-specific wrinkles by controlling expression parameters while keeping person-specific details unchanged. DECA is learned from in-the-wild images with no paired 3D supervision and achieves state-of-the-art shape reconstruction accuracy on two benchmarks. Qualitative results on in-the-wild data demonstrate DECA's robustness and its ability to disentangle identity- and expression-dependent details enabling animation of reconstructed faces. The model and code are publicly available at https://deca.is.tue.mpg.de.
Pages: 13