HQ3DAvatar: High-quality Implicit 3D Head Avatar

Cited by: 1
Authors
Teotia, Kartik [1 ,2 ]
Mallikarjun, B. R. [1 ,2 ]
Pan, Xingang [1 ,3 ]
Kim, Hyeongwoo [4 ]
Garrido, Pablo [5 ]
Elgharib, Mohamed [1 ]
Theobalt, Christian [1 ,2 ]
Affiliations
[1] Max Planck Inst Informat, Saarbrucken, Germany
[2] Univ Saarland, Saarbrucken, Germany
[3] Nanyang Technol Univ, Singapore, Singapore
[4] Imperial Coll London, London, England
[5] Flawless AI, Los Angeles, CA USA
Source
ACM TRANSACTIONS ON GRAPHICS | 2024 / Vol. 43 / No. 3
Keywords
Volumetric rendering; implicit representations; Neural Radiance Fields; neural avatars; free-viewpoint rendering
DOI
10.1145/3649889
Chinese Library Classification
TP31 [Computer software];
Subject Classification Code
081202; 0835;
Abstract
Multi-view volumetric rendering techniques have recently shown great potential in modeling and synthesizing high-quality head avatars. A common approach to capturing full head dynamic performances is to track the underlying geometry using a mesh-based template or 3D cube-based graphics primitives. While these model-based approaches achieve promising results, they often fail to learn complex geometric details such as the mouth interior, hair, and topological changes over time. This article presents a novel approach to building highly photorealistic digital head avatars. Our method learns a canonical space via an implicit function parameterized by a neural network. It leverages multiresolution hash encoding in the learned feature space, allowing for high-quality, fast training and high-resolution rendering. At test time, our method is driven by a monocular RGB video. Here, an image encoder extracts face-specific features that also condition the learnable canonical space. This encourages deformation-dependent texture variations during training. We also propose a novel optical flow-based loss that ensures correspondences in the learned canonical space, thus encouraging artifact-free and temporally consistent renderings. We show results on challenging facial expressions and show free-viewpoint renderings at interactive real-time rates for a resolution of 480×270. Our method outperforms related approaches both visually and numerically. We will release our multiple-identity dataset to encourage further research.
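The multiresolution hash encoding the abstract refers to (in the style of Instant-NGP) maps a 3D query point to learnable features by trilinearly interpolating entries of small hash tables at several grid resolutions. The following is a minimal NumPy sketch of that idea only, not the paper's implementation; all names, table sizes, and the level count are illustrative assumptions, and the tables here are random rather than trained.

```python
import numpy as np

def hash_encode(points, n_levels=4, table_size=2**14, n_features=2,
                base_res=16, growth=1.5, seed=0):
    """Simplified multiresolution hash encoding (Instant-NGP style).

    points: (N, 3) array of query positions in [0, 1]^3.
    Returns an (N, n_levels * n_features) feature array, trilinearly
    interpolated from one random (untrained) hash table per level.
    """
    rng = np.random.default_rng(seed)
    # In a real model these tables are learnable parameters.
    tables = rng.normal(scale=1e-2, size=(n_levels, table_size, n_features))
    primes = np.array([1, 2654435761, 805459861], dtype=np.uint64)

    feats = []
    for lvl in range(n_levels):
        res = int(base_res * growth ** lvl)       # grid resolution at this level
        scaled = points * res
        lower = np.floor(scaled).astype(np.uint64)  # (N, 3) lower voxel corner
        frac = scaled - lower                       # (N, 3) position inside voxel
        acc = np.zeros((points.shape[0], n_features))
        # Trilinear interpolation over the 8 corners of each voxel.
        for corner in range(8):
            offset = np.array([(corner >> d) & 1 for d in range(3)],
                              dtype=np.uint64)
            idx = lower + offset
            # Spatial hash: XOR of coordinates multiplied by large primes.
            h = (idx * primes).astype(np.uint64)
            h = (h[:, 0] ^ h[:, 1] ^ h[:, 2]) % table_size
            w = np.prod(np.where(offset == 1, frac, 1.0 - frac), axis=1)
            acc += w[:, None] * tables[lvl][h]
        feats.append(acc)
    return np.concatenate(feats, axis=1)
```

In the full method, these per-level features would be concatenated and fed to a small MLP that predicts color and density; the coarse levels capture smooth structure while the fine levels resolve detail such as hair.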
Pages: 24