Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes

Cited by: 410
Authors
Li, Zhengqi [1]
Niklaus, Simon [2]
Snavely, Noah [1]
Wang, Oliver [2]
Affiliations
[1] Cornell Tech, New York, NY 10044 USA
[2] Adobe Res, San Jose, CA USA
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
Keywords
INTERPOLATION;
DOI
10.1109/CVPR46437.2021.00643
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a method to perform novel view and time synthesis of dynamic scenes, requiring only a monocular video with known camera poses as input. To do this, we introduce Neural Scene Flow Fields, a new representation that models the dynamic scene as a time-variant continuous function of appearance, geometry, and 3D scene motion. Our representation is optimized through a neural network to fit the observed input views. We show that our representation can be used for a variety of in-the-wild scenes, including those with thin structures, view-dependent effects, and complex degrees of motion. We conduct a number of experiments that demonstrate that our approach significantly outperforms recent monocular view synthesis methods, and show qualitative results of space-time view synthesis on a variety of real-world videos.
Pages: 6494-6504
Page count: 11