MPS-NeRF: Generalizable 3D Human Rendering From Multiview Images

Cited by: 1
Authors
Gao X. [1 ]
Yang J. [2 ]
Kim J. [2 ]
Peng S. [3 ]
Liu Z. [4 ]
Tong X. [2 ]
Affiliations
[1] Beijing Institute of Technology, Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing
[2] Microsoft Research Asia, Beijing
[3] Zhejiang University, College of Computer Science and Technology, Hangzhou, Zhejiang
[4] Microsoft Azure AI, Redmond, WA 98004
Keywords
human synthesis; neural radiance field; neural rendering
DOI
10.1109/TPAMI.2022.3205910
Abstract
There has been rapid progress recently on 3D human rendering, including novel view synthesis and pose animation, based on advances in neural radiance fields (NeRF). However, most existing methods focus on person-specific training, and their training typically requires multiview videos. This article deals with a new, challenging task: rendering novel views and novel poses of a person unseen in training, using only multiview still images as input, without videos. For this task, we propose a simple yet surprisingly effective method to train a generalizable NeRF with multiview images as conditional input. The key ingredient is a dedicated representation combining a canonical NeRF and a volume deformation scheme. Using a canonical space enables our method to learn shared properties of humans and to generalize easily to different people. Volume deformation is used to connect the canonical space with the input and target images and to query image features for radiance and density prediction. We leverage a parametric 3D human model fitted on the input images to derive the deformation, which works quite well in practice when combined with our canonical NeRF. Experiments on both real and synthetic data, covering the novel view synthesis and pose animation tasks, collectively demonstrate the efficacy of our method.
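To make the two ingredients of the abstract concrete, below is a minimal, hypothetical PyTorch sketch: an inverse linear-blend-skinning warp that maps points sampled in the target (posed) space back to the canonical space using per-point skinning weights and per-joint transforms from a fitted parametric body model, and a canonical NeRF MLP conditioned on image features sampled from the input views. All names, signatures, and dimensions here are illustrative assumptions, not the authors' code.

import torch
import torch.nn as nn

class CanonicalNeRF(nn.Module):
    # Predicts color and density for a point in the canonical body space,
    # conditioned on features sampled from the input-view images.
    def __init__(self, feat_dim=32, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # outputs (r, g, b, sigma)
        )

    def forward(self, x_canonical, img_feats):
        out = self.mlp(torch.cat([x_canonical, img_feats], dim=-1))
        return torch.sigmoid(out[..., :3]), torch.relu(out[..., 3:])  # rgb, sigma

def warp_to_canonical(x_posed, blend_weights, joint_transforms):
    # Inverse linear blend skinning: undo the per-point rigid transform
    # induced by the fitted body model to land in the canonical space.
    # x_posed: (N, 3); blend_weights: (N, J); joint_transforms: (J, 4, 4)
    T = torch.einsum('nj,jab->nab', blend_weights, joint_transforms)  # (N, 4, 4)
    x_h = torch.cat([x_posed, torch.ones_like(x_posed[:, :1])], dim=-1)
    return torch.einsum('nab,nb->na', torch.inverse(T), x_h)[:, :3]

In the full method, the rgb and sigma values returned for points along each target-view ray would be composited with the standard NeRF volume-rendering integral; this sketch covers only the per-point queries.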
Pages: 6110-6121
Page count: 12