HQ3DAvatar: High-quality Implicit 3D Head Avatar

Cited by: 1
Authors
Teotia, Kartik [1, 2]
Mallikarjun, B. R. [1, 2]
Pan, Xingang [1, 3]
Kim, Hyeongwoo [4]
Garrido, Pablo [5]
Elgharib, Mohamed [1]
Theobalt, Christian [1, 2]
Affiliations
[1] Max Planck Inst Informat, Saarbrucken, Germany
[2] Univ Saarland, Saarbrucken, Germany
[3] Nanyang Technol Univ, Singapore, Singapore
[4] Imperial Coll London, London, England
[5] Flawless AI, Los Angeles, CA USA
Source
ACM TRANSACTIONS ON GRAPHICS | 2024, Vol. 43, No. 3
Keywords
Volumetric rendering; implicit representations; Neural Radiance Fields; neural avatars; free-viewpoint rendering
DOI
10.1145/3649889
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Multi-view volumetric rendering techniques have recently shown great potential in modeling and synthesizing high-quality head avatars. A common approach to capturing full-head dynamic performances is to track the underlying geometry using a mesh-based template or 3D cube-based graphics primitives. While these model-based approaches achieve promising results, they often fail to learn complex geometric details such as the mouth interior, hair, and topological changes over time. This article presents a novel approach to building highly photorealistic digital head avatars. Our method learns a canonical space via an implicit function parameterized by a neural network. It leverages multiresolution hash encoding in the learned feature space, allowing for high quality, faster training, and high-resolution rendering. At test time, our method is driven by a monocular RGB video. Here, an image encoder extracts face-specific features that also condition the learnable canonical space. This encourages deformation-dependent texture variations during training. We also propose a novel optical flow-based loss that ensures correspondences in the learned canonical space, thus encouraging artifact-free and temporally consistent renderings. We show results on challenging facial expressions and show free-viewpoint renderings at interactive real-time rates for a resolution of 480x270. Our method outperforms related approaches both visually and numerically. We will release our multiple-identity dataset to encourage further research.
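The abstract mentions multiresolution hash encoding without detail. As a rough illustration only, the sketch below follows the general scheme commonly associated with that name: at each resolution level, the eight grid corners around a point are hashed into a feature table and the looked-up features are trilinearly interpolated, then all levels are concatenated. It is not the authors' implementation. The function names, the hash primes, `base_res`, `growth`, and the table shapes are all assumptions chosen for the example, and it encodes plain 3D points rather than the learned feature space the abstract refers to.

```python
import numpy as np

# Assumed hash primes (a common choice for spatial hashing, not from the source).
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_corners(corners, table_size):
    """Spatial hash: XOR of (coordinate * prime) per axis, modulo table size."""
    c = corners.astype(np.uint64) * PRIMES
    h = (c[..., 0] ^ c[..., 1] ^ c[..., 2]) % np.uint64(table_size)
    return h.astype(np.int64)

def encode(points, tables, base_res=16, growth=2.0):
    """Encode points of shape (N, 3) in [0, 1]^3.

    tables is a list of (T, F) feature arrays, one per resolution level.
    Returns an array of shape (N, len(tables) * F).
    """
    feats = []
    for level, table in enumerate(tables):
        res = int(np.floor(base_res * growth ** level))
        scaled = points * res
        lo = np.floor(scaled).astype(np.int64)   # lower grid corner
        frac = scaled - lo                       # position inside the cell
        out = np.zeros((points.shape[0], table.shape[1]))
        for offset in np.ndindex(2, 2, 2):       # the 8 cell corners
            offset = np.array(offset)
            idx = hash_corners(lo + offset, table.shape[0])
            # Trilinear weight: frac along axes where offset is 1, else 1 - frac.
            w = np.prod(np.where(offset == 1, frac, 1.0 - frac), axis=1)
            out += w[:, None] * table[idx]
        feats.append(out)
    return np.concatenate(feats, axis=1)

rng = np.random.default_rng(0)
pts = rng.random((5, 3))
tables = [rng.normal(size=(64, 2)) for _ in range(3)]
features = encode(pts, tables)   # shape (5, 6): 3 levels x 2 features
```

In a trained model the tables would be learnable parameters optimized together with the network that consumes `features`; here they are random arrays so the example runs on its own.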
Pages: 24