SUPERVEGAN: Super Resolution Video Enhancement GAN for Perceptually Improving Low Bitrate Streams

Cited by: 5
Authors
Andrei, Silviu S. [1]
Shapovalova, Nataliya [1]
Mayol-Cuevas, Walterio [1]
Affiliations
[1] Amazon, Seattle, WA 98104 USA
Source
IEEE ACCESS | 2021 / Vol. 9
Keywords
Streaming media; Image resolution; Bit rate; Image coding; Training; Task analysis; Measurement; Video super resolution; artifact removal; video enhancement;
DOI
10.1109/ACCESS.2021.3090344
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents a novel model family, which we call SUPERVEGAN, for enhancing low bitrate video streams by simultaneously performing video super resolution and removing compression artifacts caused by low bitrates (e.g., 250 Kbps). Our strategy is fully end-to-end, but we upsample and tackle the problem in two main stages. The first stage removes streaming compression artifacts and performs a partial upsampling, and the second stage performs the final upsampling and adds detail generatively. We also use a novel progressive training strategy for video, together with perceptual metrics. Our experiments show resilience to training bitrate, and we show how to derive real-time models. We also introduce a novel bitrate equivalency test that enables the assessment of how much a model improves streams with respect to bitrate. We demonstrate efficacy on two publicly available HD datasets, LIVE-NFLX-II and Tears of Steel (TOS). We compare against a range of baselines and encoders, and our results demonstrate that our models achieve a perceptual equivalence of up to two times the input bitrate. In particular, our 4X upsampling model outperforms baseline methods on the LPIPS perceptual metric, and our 2X upsampling model also outperforms baselines on traditional metrics such as PSNR.
Pages: 91160-91174
Page count: 15
Related papers
34 in total
  • [1] [Anonymous], 2018, P NEURIPS
  • [2] [Anonymous], 2013, TEARS OF STEEL
  • [3] Evolution strategies – A comprehensive introduction
    Hans-Georg Beyer
    Hans-Paul Schwefel
    [J]. Natural Computing, 2002, 1 (1) : 3 - 52
  • [4] Bampis Christos G, 2018, ARXIV180803898
  • [5] MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement
    Bao, Wenbo
    Lai, Wei-Sheng
    Zhang, Xiaoyun
    Gao, Zhiyong
    Yang, Ming-Hsuan
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (03) : 933 - 948
  • [6] Blau Y, 2019, PR MACH LEARN RES, V97
  • [7] The Perception-Distortion Tradeoff
    Blau, Yochai
    Michaeli, Tomer
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6228 - 6237
  • [8] Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation
    Caballero, Jose
    Ledig, Christian
    Aitken, Andrew
    Acosta, Alejandro
    Totz, Johannes
    Wang, Zehan
    Shi, Wenzhe
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 2848 - 2857
  • [9] Learning Image and Video Compression through Spatial-Temporal Energy Compaction
    Cheng, Zhengxue
    Sun, Heming
    Takeuchi, Masaru
    Katto, Jiro
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10063 - 10072
  • [10] Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation
    Chu, Mengyu
    Xie, You
    Mayer, Jonas
    Leal-Taixé, Laura
    Thuerey, Nils
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2020, 39 (04):