Temporal Consistency Learning of Inter-Frames for Video Super-Resolution

Cited by: 24
Authors
Liu, Meiqin [1 ,2 ]
Jin, Shuo [1 ,2 ]
Yao, Chao [3 ]
Lin, Chunyu [1 ,2 ]
Zhao, Yao [1 ,2 ]
Affiliations
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing 100044, Peoples R China
[2] Beijing Jiaotong Univ, Beijing Key Lab Adv Informat Sci & Network Technol, Beijing 100044, Peoples R China
[3] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Superresolution; Circuit stability; Optical flow; Image restoration; Image reconstruction; Degradation; Convolution; Bidirectional motion estimation; temporal consistency; self-alignment; video super-resolution; FUSION NETWORK;
DOI
10.1109/TCSVT.2022.3214538
Chinese Library Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Codes
0808; 0809;
Abstract
Video super-resolution (VSR) aims to reconstruct high-resolution (HR) frames from a low-resolution (LR) reference frame and multiple neighboring frames. The vital operation is to exploit the relatively misaligned frames for reconstructing the current frame while preserving the consistency of the results. Existing methods generally explore information propagation and frame alignment to improve VSR performance; however, few studies focus on the temporal consistency of inter-frames. In this paper, we propose a Temporal Consistency learning Network (TCNet) for VSR, trained in an end-to-end manner, to enhance the consistency of the reconstructed videos. A spatio-temporal stability module is designed to learn self-alignment from inter-frames. In particular, correlative matching is employed to exploit the spatial dependency within each frame to maintain structural stability. Moreover, a self-attention mechanism is utilized to learn the temporal correspondence and implement an adaptive warping operation for temporal consistency among multiple frames. In addition, a hybrid recurrent architecture is designed to leverage both short-term and long-term information. We further present a progressive fusion module that performs a multistage fusion of spatio-temporal features; the final reconstructed frames are refined by these fused features. Objective and subjective results of various experiments demonstrate that TCNet outperforms several state-of-the-art methods on different benchmark datasets.
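The adaptive warping idea the abstract describes can be sketched in isolation: attention weights computed between reference-frame and neighbor-frame features define a soft correspondence, and the neighbor's features are resampled through those weights instead of through an explicit optical-flow warp. The sketch below is a minimal illustration of this general technique, not the authors' TCNet implementation; the function name and shapes are assumptions for the example.

```python
# Illustrative sketch (not the paper's code): warping a neighboring frame's
# features toward the reference frame via spatial self-attention.
import torch

def attention_warp(ref_feat: torch.Tensor, nbr_feat: torch.Tensor) -> torch.Tensor:
    """Adaptively warp nbr_feat toward ref_feat.

    ref_feat, nbr_feat: (B, C, H, W) feature maps. Queries come from the
    reference frame, keys/values from the neighbor, so each reference
    position gathers a weighted mix of neighbor positions.
    """
    b, c, h, w = ref_feat.shape
    q = ref_feat.flatten(2).transpose(1, 2)   # (B, HW, C) queries
    k = nbr_feat.flatten(2)                   # (B, C, HW) keys
    v = nbr_feat.flatten(2).transpose(1, 2)   # (B, HW, C) values
    # Scaled dot-product attention over all spatial positions.
    attn = torch.softmax(q @ k / c ** 0.5, dim=-1)      # (B, HW, HW)
    warped = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
    return warped
```

Note that full spatial attention costs O((HW)^2) memory, so practical VSR networks typically restrict the matching to local windows or downsampled features; the sketch ignores that for clarity.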
Pages: 1507-1520
Page count: 14