SUPERVEGAN: Super Resolution Video Enhancement GAN for Perceptually Improving Low Bitrate Streams

Cited by: 7
Authors
Andrei, Silviu S. [1]
Shapovalova, Nataliya [1]
Mayol-Cuevas, Walterio [1]
Affiliations
[1] Amazon, Seattle, WA 98104 USA
Keywords
Streaming media; Image resolution; Bit rate; Image coding; Training; Task analysis; Measurement; Video super resolution; Artifact removal; Video enhancement
DOI
10.1109/ACCESS.2021.3090344
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents SUPERVEGAN, a novel model family for enhancing low bitrate video streams by simultaneously performing video super resolution and removing the compression artifacts introduced at low bitrates (e.g., 250 Kbps). Our strategy is fully end-to-end, but we upsample and tackle the problem in two main stages. The first stage removes streaming compression artifacts and performs a partial upsampling; the second stage performs the final upsampling and adds detail generatively. We also use a novel progressive training strategy for video, together with perceptual metrics. Our experiments show resilience to training bitrate, and we show how to derive real-time models. We also introduce a novel bitrate equivalency test that enables the assessment of how much a model improves streams with respect to bitrate. We demonstrate efficacy on two publicly available HD datasets, LIVE-NFLX-II and Tears of Steel (TOS). We compare against a range of baselines and encoders, and our results demonstrate that our models achieve perceptual equivalence to streams at up to twice the input bitrate. In particular, our 4X upsampling model outperforms baseline methods on the LPIPS perceptual metric, and our 2X upsampling model also outperforms baselines on traditional metrics such as PSNR.
Pages: 91160-91174
Page count: 15