Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation

Cited by: 549
Authors
Caballero, Jose [1]
Ledig, Christian [1]
Aitken, Andrew [1]
Acosta, Alejandro [1]
Totz, Johannes [1]
Wang, Zehan [1]
Shi, Wenzhe [1]
Affiliation
[1] Twitter, San Francisco, CA 94103 USA
Source
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) | 2017
Keywords
IMAGE SUPERRESOLUTION; QUALITY ASSESSMENT
DOI
10.1109/CVPR.2017.304
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional neural networks have enabled accurate image super-resolution in real time. However, recent attempts to benefit from temporal correlations in video super-resolution have been limited to naive or inefficient architectures. In this paper, we introduce spatio-temporal sub-pixel convolution networks that effectively exploit temporal redundancies and improve reconstruction accuracy while maintaining real-time speed. Specifically, we discuss the use of early fusion, slow fusion and 3D convolutions for the joint processing of multiple consecutive video frames. We also propose a novel joint motion compensation and video super-resolution algorithm that is orders of magnitude more efficient than competing methods, relying on a fast multi-resolution spatial transformer module that is end-to-end trainable. These contributions provide both higher accuracy and temporally more consistent videos, which we confirm qualitatively and quantitatively. Relative to single-frame models, spatio-temporal networks can either reduce the computational cost by 30% whilst maintaining the same quality or provide a 0.2 dB gain for a similar computational cost. Results on publicly available datasets demonstrate that the proposed algorithms surpass current state-of-the-art performance in both accuracy and efficiency.
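As a rough illustration of two ideas named in the abstract, the sketch below shows early fusion (stacking consecutive frames along the channel axis before any processing) and the sub-pixel rearrangement that turns r*r low-resolution feature maps into one map upscaled by r. It is a minimal sketch in plain Python on nested lists, not the authors' implementation: the function names `early_fusion` and `pixel_shuffle` and the channel ordering (the common depth-to-space convention) are assumptions, and the convolution layers that would sit between the two steps are omitted.

```python
def early_fusion(frames):
    """Stack T frames, each shaped [C][H][W], into one [T*C][H][W] input.

    Hypothetical helper: early fusion is modeled here as plain channel
    concatenation of consecutive frames.
    """
    return [channel for frame in frames for channel in frame]


def pixel_shuffle(x, r):
    """Rearrange feature maps [C*r*r][H][W] into [C][H*r][W*r].

    Assumed ordering: input channel c*r*r + i*r + j supplies the output
    pixel at row offset i and column offset j inside each r-by-r cell.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    h, w = len(x[0]), len(x[0][0])
    c_out = channels_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Three single-channel 2x2 frames become one 3-channel input.
frames = [[[[t] * 2 for _ in range(2)]] for t in range(3)]
fused = early_fusion(frames)

# Four 1x1 feature maps become one 2x2 map for an upscaling factor of 2.
upscaled = pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2)
```

In a full network, the fused input would pass through learned convolutions that emit C*r*r feature maps at low resolution, so the rearrangement is the only step that touches the high-resolution grid.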
Pages: 2848-2857
Number of pages: 10