Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes

Cited by: 410

Authors:

Li, Zhengqi [1]

Niklaus, Simon [2]

Snavely, Noah [1]

Wang, Oliver [2]

Affiliations:

[1] Cornell Tech, New York, NY 10044 USA

[2] Adobe Res, San Jose, CA USA

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Keywords:

INTERPOLATION;

DOI:

10.1109/CVPR46437.2021.00643

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present a method to perform novel view and time synthesis of dynamic scenes, requiring only a monocular video with known camera poses as input. To do this, we introduce Neural Scene Flow Fields, a new representation that models the dynamic scene as a time-variant continuous function of appearance, geometry, and 3D scene motion. Our representation is optimized through a neural network to fit the observed input views. We show that our representation can be used for a variety of in-the-wild scenes, including those with thin structures, view-dependent effects, and complex degrees of motion. We conduct a number of experiments that demonstrate that our approach significantly outperforms recent monocular view synthesis methods, and show qualitative results of space-time view synthesis on a variety of real-world videos.


Pages: 6494-6504

Number of pages: 11

