Authentic Volumetric Avatars from a Phone Scan

Cited by: 68
Authors
Cao, Chen [1 ]
Simon, Tomas [1 ]
Kim, Jin Kyu [1 ]
Schwartz, Gabe [1 ]
Zollhoefer, Michael [1 ]
Saito, Shunsuke [1 ]
Lombardi, Stephen [1 ]
Wei, Shih-En [1 ]
Belko, Danielle [1 ]
Yu, Shoou-I [1 ]
Sheikh, Yaser [1 ]
Saragih, Jason [1 ]
Affiliations
[1] Reality Labs Research, 131 15th St, Pittsburgh, PA 15222, USA
Source
ACM TRANSACTIONS ON GRAPHICS | 2022, Vol. 41, No. 4
Keywords
3D Avatar Creation; Neural Rendering
DOI
10.1145/3528223.3530143
CLC Classification
TP31 [Computer Software];
Subject Classification
081202; 0835
Abstract
Creating photorealistic avatars of existing people currently requires extensive person-specific data capture, which is usually only accessible to the VFX industry and not the general public. Our work aims to address this drawback by relying only on a short mobile phone capture to obtain a drivable 3D head avatar that matches a person's likeness faithfully. In contrast to existing approaches, our architecture avoids the complex task of directly modeling the entire manifold of human appearance, aiming instead to generate an avatar model that can be specialized to novel identities using only small amounts of data. The model dispenses with low-dimensional latent spaces that are commonly employed for hallucinating novel identities, and instead, uses a conditional representation that can extract person-specific information at multiple scales from a high resolution registered neutral phone scan. We achieve high quality results through the use of a novel universal avatar prior that has been trained on high resolution multi-view video captures of facial performances of hundreds of human subjects. By fine-tuning the model using inverse rendering we achieve increased realism and personalize its range of motion. The output of our approach is not only a high-fidelity 3D head avatar that matches the person's facial shape and appearance, but one that can also be driven using a jointly discovered shared global expression space with disentangled controls for gaze direction. Via a series of experiments we demonstrate that our avatars are faithful representations of the subject's likeness. Compared to other state-of-the-art methods for lightweight avatar creation, our approach exhibits superior visual quality and animateability.
Pages: 19
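The abstract's central mechanism (conditioning a shared decoder on multi-scale features extracted from the registered neutral scan, rather than on a low-dimensional identity latent, and then personalizing the result by inverse rendering against the phone capture) can be illustrated with a minimal PyTorch sketch. All module names, tensor shapes, and the stand-in reconstruction loss below are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class IdentityEncoder(nn.Module):
    """Encodes a registered neutral scan (texture + geometry image) into
    feature maps at several scales, instead of a single latent code."""

    def __init__(self, in_ch=6, base_ch=32, num_scales=4):
        super().__init__()
        self.blocks = nn.ModuleList()
        ch = in_ch
        for i in range(num_scales):
            out_ch = base_ch * (2 ** i)
            self.blocks.append(nn.Sequential(
                nn.Conv2d(ch, out_ch, 3, stride=2, padding=1),
                nn.LeakyReLU(0.2, inplace=True)))
            ch = out_ch

    def forward(self, neutral_scan):
        feats, x = [], neutral_scan
        for block in self.blocks:
            x = block(x)
            feats.append(x)  # fine-to-coarse identity features
        return feats


class ConditionedDecoder(nn.Module):
    """Expression-driven decoder whose activations are biased by the identity
    features, so one shared ("universal") decoder specializes to a person."""

    def __init__(self, expr_dim=256, base_ch=32, num_scales=4):
        super().__init__()
        chs = [base_ch * (2 ** i) for i in range(num_scales)][::-1]  # e.g. [256, 128, 64, 32]
        self.start = nn.Linear(expr_dim, chs[0] * 4 * 4)
        self.inject = nn.ModuleList([nn.Conv2d(c, c, 1) for c in chs])
        self.ups = nn.ModuleList([
            nn.ConvTranspose2d(chs[i], chs[i + 1], 4, stride=2, padding=1)
            for i in range(num_scales - 1)])
        self.to_rgb = nn.Conv2d(chs[-1], 3, 3, padding=1)

    def forward(self, expr_code, id_feats):
        x = self.start(expr_code).view(expr_code.shape[0], -1, 4, 4)
        x = x + self.inject[0](id_feats[-1])                 # coarsest identity bias
        for i, up in enumerate(self.ups):
            x = F.leaky_relu(up(x), 0.2)
            x = x + self.inject[i + 1](id_feats[-(i + 2)])   # progressively finer identity bias
        return torch.sigmoid(self.to_rgb(x))


# Toy usage: a 6-channel neutral scan (RGB texture + xyz position map) conditions
# the decoder; the L1 term below is only a stand-in for the inverse-rendering
# loss against real phone frames described in the abstract.
enc, dec = IdentityEncoder(), ConditionedDecoder()
neutral = torch.randn(1, 6, 64, 64)   # registered neutral scan (toy resolution)
expr = torch.randn(1, 256)            # code in the shared global expression space
rgb = dec(expr, enc(neutral))         # (1, 3, 32, 32) decoded appearance
loss = F.l1_loss(rgb, torch.rand_like(rgb))
loss.backward()

The point of the multi-scale injection is that person-specific detail from the neutral scan reaches the decoder at matching resolutions, which a single low-dimensional identity code could not carry; fine-tuning the conditioned model on the phone frames then further personalizes appearance and range of motion, as described in the abstract.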