DeepFovea: Neural Reconstruction for Foveated Rendering and Video Compression using Learned Statistics of Natural Videos

Cited by: 89

Authors:

Kaplanyan, Anton S. [1]

Sochenov, Anton [1]

Leimkuhler, Thomas [1]

Okunev, Mikhail [1]

Goodall, Todd [1]

Rufo, Gizem [1]

Affiliations:

[1] Facebook Reality Labs, MPI Informatik, Menlo Park, CA 94025, USA

Source:

ACM Transactions on Graphics | 2019 / Vol. 38 / No. 6

Keywords:

generative networks; perceptual rendering; foveated rendering; deep learning; virtual reality; gaze-contingent rendering; video compression; video generation; quality assessment; contrast sensitivity; perception

DOI:

10.1145/3355089.3356557

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification codes:

081202; 0835

Abstract:

In order to provide an immersive visual experience, modern displays require head mounting, high image resolution, low latency, as well as high refresh rate. This poses a challenging computational problem. On the other hand, the human visual system can consume only a tiny fraction of this video stream due to the drastic acuity loss in the peripheral vision. Foveated rendering and compression can save computations by reducing the image quality in the peripheral vision. However, this can cause noticeable artifacts in the periphery, or, if done conservatively, would provide only modest savings. In this work, we explore a novel foveated reconstruction method that employs the recent advances in generative adversarial neural networks. We reconstruct a plausible peripheral video from a small fraction of pixels provided every frame. The reconstruction is done by finding the closest matching video to this sparse input stream of pixels on the learned manifold of natural videos. Our method is more efficient than the state-of-the-art foveated rendering, while providing the visual experience with no noticeable quality degradation. We conducted a user study to validate our reconstruction method and compare it against existing foveated rendering and video compression techniques. Our method is fast enough to drive gaze-contingent head-mounted displays in real time on modern hardware. We plan to publish the trained network to establish a new quality bar for foveated rendering and compression as well as encourage follow-up research.


Pages: 13

References: 65 in total

