Learning a spatial-temporal symmetry network for video super-resolution

Cited by: 1

Authors:

Wang, Xiaohang [1,2]

Liu, Mingliang [1,2]

Wei, Pengying [1,2]

Affiliations:

[1] Heilongjiang Univ, Dept Automat, Harbin 150080, Heilongjiang, Peoples R China

[2] Heilongjiang Univ, Key Lab Informat Fus Estimat & Detect, Harbin 150080, Heilongjiang, Peoples R China

Source:

APPLIED INTELLIGENCE | 2023 / Vol. 53 / Issue 03

Keywords:

Video super-resolution; Motion estimation; Spatial-temporal symmetry; Convolutional neural network; CONVOLUTION;

DOI:

10.1007/s10489-022-03603-3

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Video super-resolution (VSR) aims to estimate and restore high-resolution (HR) sequences from low-resolution (LR) input. In recent years, many learning-based VSR methods have been proposed that combine convolutional neural networks (CNNs) with motion compensation. Most mainstream approaches are based on optical flow or deformable convolution, and both require accurate motion estimates for compensation. However, most previous methods do not fully exploit the spatial-temporal symmetrical information in the input sequences. Moreover, aligning every neighbouring frame to the reference frame separately consumes considerable computation. Furthermore, many methods reconstruct HR results at only a single scale, which limits the reconstruction accuracy of the network and its performance in complex scenes. In this study, we propose a spatial-temporal symmetry network (STSN) to address these deficiencies. STSN comprises four parts: prefusion, alignment, postfusion and reconstruction. First, a two-stage fusion strategy is applied to reduce the computational cost of the network. In the prefusion module, ConvGRU is used to eliminate redundant features between neighbouring frames, and several neighbouring frames are fused and condensed into two parts. To generate accurate offset maps, we present a spatial-temporal symmetry attention block (STSAB), which exploits spatial-temporal symmetry combined with spatial attention. In the reconstruction module, we propose an SR multiscale residual block (SR-MSRB) to enhance reconstruction performance. Extensive experiments on several datasets show that our method achieves better results and efficiency than state-of-the-art methods in both quantitative and qualitative evaluations.

Pages: 3530-3544

Page count: 15
