DeepFovea: Neural Reconstruction for Foveated Rendering and Video Compression using Learned Statistics of Natural Videos

Cited by: 89
Authors
Kaplanyan, Anton S. [1]
Sochenov, Anton [1]
Leimkuhler, Thomas [1]
Okunev, Mikhail [1]
Goodall, Todd [1]
Rufo, Gizem [1]
Affiliations
[1] Facebook Reality Labs, MPI Informatik, Menlo Park, CA 94025 USA
Source
ACM TRANSACTIONS ON GRAPHICS | 2019 / Vol. 38 / Issue 06
Keywords
generative networks; perceptual rendering; foveated rendering; deep learning; virtual reality; gaze-contingent rendering; video compression; video generation; QUALITY ASSESSMENT; CONTRAST-SENSITIVITY; PERCEPTION;
DOI
10.1145/3355089.3356557
Chinese Library Classification
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
In order to provide an immersive visual experience, modern displays require head mounting, high image resolution, low latency, as well as high refresh rate. This poses a challenging computational problem. On the other hand, the human visual system can consume only a tiny fraction of this video stream due to the drastic acuity loss in the peripheral vision. Foveated rendering and compression can save computations by reducing the image quality in the peripheral vision. However, this can cause noticeable artifacts in the periphery, or, if done conservatively, would provide only modest savings. In this work, we explore a novel foveated reconstruction method that employs the recent advances in generative adversarial neural networks. We reconstruct a plausible peripheral video from a small fraction of pixels provided every frame. The reconstruction is done by finding the closest matching video to this sparse input stream of pixels on the learned manifold of natural videos. Our method is more efficient than the state-of-the-art foveated rendering, while providing the visual experience with no noticeable quality degradation. We conducted a user study to validate our reconstruction method and compare it against existing foveated rendering and video compression techniques. Our method is fast enough to drive gaze-contingent head-mounted displays in real time on modern hardware. We plan to publish the trained network to establish a new quality bar for foveated rendering and compression as well as encourage follow-up research.
Pages: 13
Related papers
65 in total
  • [1] Abu-El-Haija S., 2016, ARXIV
  • [2] Arjovsky M, 2017, PR MACH LEARN RES, V70
  • [3] Ba J. L., 2016, ARXIV160706450CSSTAT
  • [4] Bampis Christos G, 2018, ARXIV180803898
  • [5] Recycle-GAN: Unsupervised Video Retargeting
    Bansal, Aayush
    Ma, Shugao
    Ramanan, Deva
    Sheikh, Yaser
    [J]. COMPUTER VISION - ECCV 2018, PT V, 2018, 11209 : 122 - 138
  • [6] Retina-V1 model of detectability across the visual field
    Bradley, Chris
    Abrams, Jared
    Geisler, Wilson S.
    [J]. JOURNAL OF VISION, 2014, 14 (12):
  • [7] Interactive Reconstruction of Monte Carlo Image Sequences using a Recurrent Denoising Autoencoder
    Chaitanya, Chakravarty R. Alla
    Kaplanyan, Anton S.
    Schied, Christoph
    Salvi, Marco
    Lefohn, Aaron
    Nowrouzezahrai, Derek
    Aila, Timo
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2017, 36 (04):
  • [8] Video quality assessment accounting for temporal visual masking of local flicker
    Choi, Lark Kwon
    Bovik, Alan Conrad
    [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2018, 67 : 182 - 198
  • [9] Clevert D.-A., 2016, ICLR
  • [10] STOCHASTIC SAMPLING IN COMPUTER-GRAPHICS
    COOK, RL
    [J]. ACM TRANSACTIONS ON GRAPHICS, 1986, 5 (01): 51 - 72