HQ3DAvatar: High-quality Implicit 3D Head Avatar

Cited by: 1

Authors:

Teotia, Kartik [1,2]

Mallikarjun, B. R. [1,2]

Pan, Xingang [1,3]

Kim, Hyeongwoo [4]

Garrido, Pablo [5]

Elgharib, Mohamed [1]

Theobalt, Christian [1,2]

Affiliations:

[1] Max Planck Inst Informat, Saarbrucken, Germany

[2] Univ Saarland, Saarbrucken, Germany

[3] Nanyang Technol Univ, Singapore, Singapore

[4] Imperial Coll London, London, England

[5] Flawless AI, Los Angeles, CA USA

Source:

ACM TRANSACTIONS ON GRAPHICS | 2024, Vol. 43, No. 3

Keywords:

Volumetric rendering; implicit representations; Neural Radiance Fields; neural avatars; free-viewpoint rendering

DOI:

10.1145/3649889

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification code:

081202; 0835

Abstract:

Multi-view volumetric rendering techniques have recently shown great potential in modeling and synthesizing high-quality head avatars. A common approach to capture full head dynamic performances is to track the underlying geometry using a mesh-based template or 3D cube-based graphics primitives. While these model-based approaches achieve promising results, they often fail to learn complex geometric details such as the mouth interior, hair, and topological changes over time. This article presents a novel approach to building highly photorealistic digital head avatars. Our method learns a canonical space via an implicit function parameterized by a neural network. It leverages multiresolution hash encoding in the learned feature space, allowing for high quality, faster training, and high-resolution rendering. At test time, our method is driven by a monocular RGB video. Here, an image encoder extracts face-specific features that also condition the learnable canonical space. This encourages deformation-dependent texture variations during training. We also propose a novel optical flow-based loss that ensures correspondences in the learned canonical space, thus encouraging artifact-free and temporally consistent renderings. We show results on challenging facial expressions and show free-viewpoint renderings at interactive real-time rates for a resolution of 480x270. Our method outperforms related approaches both visually and numerically. We will release our multiple-identity dataset to encourage further research.


Pages: 24

References: 83 in total

