Bidirectional Multi-scale Deformable Attention for Video Super-Resolution

Cited by: 0

Authors:

Zhenghua Zhou

Boxiang Xue

Hai Wang

Jianwei Zhao

Affiliations:

[1] Zhejiang University of Finance and Economics, School of Data Sciences

[2] China Jiliang University, Department of Data Science, College of Sciences

[3] Murdoch University, Discipline of Engineering and Energy

[4] China Jiliang University, College of Information Engineering

Source:

Multimedia Tools and Applications | 2024 / Vol. 83

Keywords:

Video super-resolution; Multi-scale deformable convolution; Multi-scale attention; Bidirectional propagation

DOI:

Not available

Abstract:

Video super-resolution aims to generate a high-resolution video frame from a low-resolution video sequence. It remains a challenging problem because spatial-temporal modeling requires both temporal frame alignment and spatial feature fusion. Existing deep-learning-based methods are limited in how accurately they align frames and how effectively they fuse frames that carry multi-scale feature information. In this paper, we propose Bidirectional Multi-scale Deformable Attention (BMDA) for video super-resolution, addressing propagation, alignment, and fusion. More specifically, the Deformable Alignment Module (DAM) developed in BMDA contains two kinds of modules: the Multi-scale Deformable Convolution Module (MDCM) and the Multi-scale Attention Module (MAM). MDCM handles offset information at different scales and aligns adjacent frames at the feature level, which improves the robustness of alignment among adjacent frames. MAM extracts the local and global features of the aligned features for aggregation, so that feature information is compensated between pixels. Additionally, to make full use of shallow features, a dense connection structure between layers is adopted in the bidirectional propagation framework to achieve better visual performance on video super-resolution. In particular, the proposed BMDA outperforms BasicVSR by up to 1.28 dB in PSNR when the batch size is set to 2. Experimental results on public video benchmark datasets demonstrate that the proposed method achieves superior performance on large-motion videos compared with state-of-the-art methods.
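To put the reported PSNR gain in context, the following is a minimal sketch of the standard PSNR definition in pure Python. It is not taken from the paper: the function names `psnr` and `mse_ratio_for_gain`, the flat-list pixel representation, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Return PSNR in dB between two equally sized sequences of pixel values.

    Assumes pixel values on a [0, peak] scale (255 for 8-bit images).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Return the factor by which MSE shrinks for a PSNR gain of gain_db."""
    return 10.0 ** (gain_db / 10.0)
```

Under this definition, a 1.28 dB gain corresponds to `mse_ratio_for_gain(1.28)`, a factor of roughly 1.34, which is about a 26% reduction in mean squared error at a fixed peak value.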

Pages: 27809-27830

Page count: 21
