paGAN: Real-time Avatars Using Dynamic Textures

Cited by: 0
Authors
Nagano, Koki [1 ,2 ]
Seo, Jaewoo [1 ]
Xing, Jun [2 ]
Wei, Lingyu [1 ]
Li, Zimo [3 ]
Saito, Shunsuke [1 ,3 ]
Agarwal, Aviral [1 ]
Fursund, Jens [1 ]
Li, Hao [1 ,2 ,3 ]
Affiliations
[1] Pinscreen, Santa Monica, CA 90401 USA
[2] USC Inst Creat Technol, Los Angeles, CA 90094 USA
[3] Univ Southern Calif, Los Angeles, CA USA
Source
SIGGRAPH Asia '18: SIGGRAPH Asia 2018 Technical Papers | 2018
Keywords
Digital avatar; Texture synthesis; Image-based rendering; Generative adversarial network; Facial animation; DATABASE; FACES;
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods];
Discipline Code
081202 ;
Abstract
With the rising interest in personalized VR and gaming experiences comes the need to create high quality 3D avatars that are both low-cost and variegated. Due to this, building dynamic avatars from a single unconstrained input image is becoming a popular application. While previous techniques that attempt this require multiple input images or rely on transferring dynamic facial appearance from a source actor, we are able to do so using only one 2D input image without any form of transfer from a source image. We achieve this using a new conditional Generative Adversarial Network design that allows fine-scale manipulation of any facial input image into a new expression while preserving its identity. Our photoreal avatar GAN (paGAN) can also synthesize the unseen mouth interior and control the eye-gaze direction of the output, as well as produce the final image from a novel viewpoint. The method is even capable of generating fully-controllable temporally stable video sequences, despite not using temporal information during training. After training, we can use our network to produce dynamic image-based avatars that are controllable on mobile devices in real time. To do this, we compute a fixed set of output images that correspond to key blendshapes, from which we extract textures in UV space. Using a subject's expression blendshapes at run-time, we can linearly blend these key textures together to achieve the desired appearance. Furthermore, we can use the mouth interior and eye textures produced by our network to synthesize on-the-fly avatar animations for those regions. Our work produces state-of-the-art quality image and video synthesis, and is the first to our knowledge that is able to generate a dynamically textured avatar with a mouth interior, all from a single image.
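The abstract describes the run-time step on mobile devices: a fixed set of key textures (one per key blendshape, extracted in UV space) is linearly blended using the subject's expression blendshape weights to produce the current appearance. A minimal sketch of that linear blend is below; the function name, array shapes, and toy data are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def blend_key_textures(key_textures, weights):
    """Linearly blend precomputed key expression textures in UV space.

    key_textures: (K, H, W, 3) array, one texture per key blendshape
    weights:      (K,) blendshape activation weights at the current frame
    Returns the blended (H, W, 3) texture.
    """
    weights = np.asarray(weights, dtype=np.float32)
    # Weighted sum over the key-texture axis (axis 0)
    return np.tensordot(weights, key_textures, axes=1)

# Toy example: two 2x2 "textures", blended 25% / 75%
keys = np.stack([np.zeros((2, 2, 3)), np.ones((2, 2, 3))]).astype(np.float32)
blended = blend_key_textures(keys, [0.25, 0.75])
```

Because the per-key textures are fixed after the network runs once, each frame costs only one weighted sum, which is what makes real-time playback on mobile feasible; the mouth-interior and eye textures are composited separately, as the abstract notes.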
Pages: 12