MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar

Cited by: 8
Authors
Chen, Yufan [1 ,2 ]
Wang, Lizhen [2 ]
Li, Qijing [2 ]
Xiao, Hongjiang [3 ]
Zhang, Shengping [1 ]
Yao, Hongxun [1 ]
Liu, Yebin [2 ]
Affiliations
[1] Harbin Inst Technol, Harbin, Peoples R China
[2] Tsinghua Univ, Beijing, Peoples R China
[3] Commun Univ China, Beijing, Peoples R China
Source
PROCEEDINGS OF SIGGRAPH 2024 CONFERENCE PAPERS | 2024
Funding
National Natural Science Foundation of China;
Keywords
Facial Reenactment; Deep Learning;
DOI
10.1145/3641519.3657499
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The ability to animate photo-realistic head avatars reconstructed from monocular portrait video sequences represents a crucial step in bridging the gap between the virtual and real worlds. Recent head avatar techniques, including explicit 3D morphable meshes (3DMM), point clouds, and neural implicit representations, have been explored in this ongoing research. However, 3DMM-based methods are constrained by their fixed topologies, point-based approaches suffer from a heavy training burden due to the extensive quantity of points involved, and neural implicit representations are limited in deformation flexibility and rendering efficiency. In response to these challenges, we propose MonoGaussianAvatar (Monocular Gaussian Point-based Head Avatar), a novel approach that harnesses 3D Gaussian point representation coupled with a Gaussian deformation field to learn explicit head avatars from monocular portrait videos. We define our head avatars with Gaussian points characterized by adaptable shapes, enabling flexible topology. These points move with a Gaussian deformation field in alignment with the target pose and expression of a person, facilitating efficient deformation. Additionally, the Gaussian points have controllable shape, size, color, and opacity combined with Gaussian splatting, allowing for efficient training and rendering. Experiments demonstrate the superior performance of our method, which achieves state-of-the-art results among previous methods. Code and data can be found at https://github.com/aipixel/MonoGaussianAvatar.
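To make the abstract's representation concrete: each avatar primitive is a 3D Gaussian point carrying controllable shape/size, color, and opacity, and a deformation field moves canonical points to match a target pose and expression. The sketch below is a minimal illustration under those stated properties only; the class fields and the `deform` function are hypothetical names, not the authors' actual data structures, and the learned offset predictor is omitted.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class GaussianPoint:
    """One 3D Gaussian primitive with the controllable attributes the
    abstract lists. Field names here are illustrative assumptions."""
    position: Vec3   # center in canonical head space
    scale: Vec3      # per-axis extent (adaptable shape and size)
    color: Vec3      # RGB radiance used during Gaussian splatting
    opacity: float   # splatting blend weight in [0, 1]

def deform(points: List[GaussianPoint],
           offsets: List[Vec3]) -> List[GaussianPoint]:
    """Toy stand-in for the Gaussian deformation field: translate each
    canonical point by a per-point offset that, in the real method,
    would be predicted from the target pose and expression."""
    return [replace(p, position=tuple(x + d for x, d in zip(p.position, o)))
            for p, o in zip(points, offsets)]

# Example: one canonical point nudged along x toward a target expression.
pt = GaussianPoint((0.0, 0.0, 0.0), (0.01, 0.01, 0.01), (0.8, 0.6, 0.5), 0.9)
moved = deform([pt], [(0.05, 0.0, 0.0)])
```

Because the other attributes (scale, color, opacity) stay attached to each point through the deformation, the deformed set can be rendered directly with a standard Gaussian splatting rasterizer.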
Pages: 9