Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation

Cited by: 544
Authors
Caballero, Jose [1]
Ledig, Christian [1]
Aitken, Andrew [1]
Acosta, Alejandro [1]
Totz, Johannes [1]
Wang, Zehan [1]
Shi, Wenzhe [1]
Affiliations
[1] Twitter, San Francisco, CA 94103 USA
Source
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) | 2017
Keywords
IMAGE SUPERRESOLUTION; QUALITY ASSESSMENT
DOI
10.1109/CVPR.2017.304
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional neural networks have enabled accurate image super-resolution in real time. However, recent attempts to benefit from temporal correlations in video super-resolution have been limited to naive or inefficient architectures. In this paper, we introduce spatio-temporal sub-pixel convolution networks that effectively exploit temporal redundancies and improve reconstruction accuracy while maintaining real-time speed. Specifically, we discuss the use of early fusion, slow fusion and 3D convolutions for the joint processing of multiple consecutive video frames. We also propose a novel joint motion compensation and video super-resolution algorithm that is orders of magnitude more efficient than competing methods, relying on a fast multi-resolution spatial transformer module that is end-to-end trainable. These contributions provide both higher accuracy and temporally more consistent videos, which we confirm qualitatively and quantitatively. Relative to single-frame models, spatio-temporal networks can either reduce the computational cost by 30% whilst maintaining the same quality or provide a 0.2 dB gain for a similar computational cost. Results on publicly available datasets demonstrate that the proposed algorithms surpass current state-of-the-art performance in both accuracy and efficiency.
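Two of the building blocks named in the abstract, early fusion and sub-pixel convolution, can be pictured with a short sketch. The following NumPy example is an illustration under stated assumptions, not the authors' implementation: the function names `early_fusion` and `pixel_shuffle`, the random 1x1 weights, and the toy frame sizes are all hypothetical, and the learned convolution layers and the motion compensation module are left out.

```python
import numpy as np


def early_fusion(frames):
    """Stack T consecutive frames of shape (C, H, W) into one (T*C, H, W) array.

    Assumption: early fusion is modelled here as channel-wise concatenation,
    so that a single 2D layer sees all frames at once.
    """
    return np.concatenate(frames, axis=0)


def pixel_shuffle(features, r):
    """Rearrange (C*r*r, H, W) features into a (C, H*r, W*r) image.

    This is the depth-to-space step of a sub-pixel convolution layer:
    out[c, h*r + i, w*r + j] = features[c*r*r + i*r + j, h, w].
    """
    channels, height, width = features.shape
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = channels // (r * r)
    x = features.reshape(c, r, r, height, width)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (c, height, r, width, r)
    return x.reshape(c, height * r, width * r)


# Toy run: three single-channel 4x6 frames, upscaling factor 2.
rng = np.random.default_rng(0)
frames = [rng.random((1, 4, 6)) for _ in range(3)]
fused = early_fusion(frames)  # shape (3, 4, 6)

r = 2
# Hypothetical stand-in for the learned layers: one random 1x1 convolution
# mapping the fused channels to r*r feature maps.
weights = rng.standard_normal((r * r, fused.shape[0]))
features = np.einsum("oc,chw->ohw", weights, fused)  # shape (4, 4, 6)

upscaled = pixel_shuffle(features, r)  # shape (1, 8, 12)
```

All computation before `pixel_shuffle` happens at low resolution, and the upscaling itself is only a rearrangement of values. This is one plausible reading of how such networks keep their cost low.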
Pages: 2848-2857
Page count: 10