Unsupervised learning of style-aware facial animation from real acting performances

Cited by: 6
Authors
Paier, Wolfgang [1 ]
Hilsmann, Anna [1 ]
Eisert, Peter [1 ,2 ]
Affiliations
[1] Fraunhofer Heinrich Hertz Inst, Berlin, Germany
[2] Humboldt Univ, Berlin, Germany
Funding
EU Horizon 2020;
Keywords
Facial animation; Neural rendering; Neural animation; Self-supervised learning; Dynamic textures; VIDEO; MODEL;
DOI
10.1016/j.gmod.2023.101199
CLC classification code
TP31 [Computer software];
Subject classification codes
081202 ; 0835 ;
Abstract
This paper presents a novel approach for text/speech-driven animation of a photo-realistic head model based on blend-shape geometry, dynamic textures, and neural rendering. Training a VAE for geometry and texture yields a parametric model for accurately capturing and realistically synthesizing facial expressions from a latent feature vector. Our animation method is based on a conditional CNN that transforms text or speech into a sequence of animation parameters. In contrast to previous approaches, our animation model learns to disentangle and synthesize different acting styles in an unsupervised manner, requiring only phonetic labels that describe the content of the training sequences. For realistic real-time rendering, we train a U-Net that refines rasterization-based renderings by computing improved pixel colors and a foreground matte. We compare our framework qualitatively and quantitatively against recent methods for head modeling and facial animation, and evaluate the perceived rendering and animation quality in a user study, which indicates large improvements over state-of-the-art approaches.
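To illustrate the rendering-refinement step mentioned in the abstract, the following is a minimal, hypothetical PyTorch sketch (not the authors' implementation; all layer sizes, channel counts, and names are assumptions) of a small U-Net-style network that takes a rasterization-based rendering and predicts refined pixel colors together with a foreground matte.

    # Illustrative sketch only: a tiny U-Net-style refiner that maps a rasterized
    # rendering to improved RGB colors plus a soft foreground matte.
    import torch
    import torch.nn as nn

    class RefinementUNet(nn.Module):
        def __init__(self, in_ch: int = 3, base: int = 32):
            super().__init__()
            # Encoder: full-resolution stage followed by one downsampling stage
            self.enc1 = nn.Sequential(
                nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU(),
                nn.Conv2d(base, base, 3, padding=1), nn.ReLU())
            self.enc2 = nn.Sequential(
                nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(base * 2, base * 2, 3, padding=1), nn.ReLU())
            # Decoder: upsample and fuse with the skip connection from enc1
            self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
            self.dec = nn.Sequential(
                nn.Conv2d(base * 2, base, 3, padding=1), nn.ReLU())
            # Output heads: 3 refined color channels and 1 foreground matte channel
            self.out_rgb = nn.Conv2d(base, 3, 1)
            self.out_matte = nn.Conv2d(base, 1, 1)

        def forward(self, raster: torch.Tensor):
            s1 = self.enc1(raster)                    # full-resolution features
            s2 = self.enc2(s1)                        # half-resolution features
            x = self.dec(torch.cat([self.up(s2), s1], dim=1))
            rgb = torch.sigmoid(self.out_rgb(x))      # refined pixel colors
            matte = torch.sigmoid(self.out_matte(x))  # soft foreground alpha
            return rgb, matte

    # Example usage on a batch of rasterized head renderings:
    # render = torch.rand(1, 3, 256, 256)
    # rgb, matte = RefinementUNet()(render)

In the paper's setting, such a refiner would be trained against real video frames so that rasterization artifacts are corrected and the head can be composited onto arbitrary backgrounds via the predicted matte; the sketch above only conveys the input/output structure, not the actual architecture or losses used by the authors.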
Pages: 13