Self-guided Transformer for Video Super-Resolution

Cited by: 0

Authors:

Xue, Tong [1]

Wang, Qianrui [1]

Huang, Xinyi [1]

Li, Dengshi [1]

Affiliations:

[1] Jianghan University, School of Artificial Intelligence, Wuhan 430056, People's Republic of China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT X | 2024 / Vol. 14434

Keywords:

Video super-resolution; Self-guided transformer; Multi-headed self-attention based on offset-guided window; Feature aggregation

DOI:

10.1007/978-981-99-8549-4_16

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The challenge of video super-resolution (VSR) is to leverage the long-range spatial-temporal correlation between low-resolution (LR) frames to generate high-resolution (HR) video frames. However, CNN-based video super-resolution approaches show limitations in modeling long-range dependencies and non-local self-similarity. In this paper, for further spatio-temporal learning, we propose a novel self-guided transformer for video super-resolution (SGTVSR). In this framework, we customize a multi-headed self-attention based on offset-guided window (OGW-MSA). For each query element on a low-resolution reference frame, the OGW-MSA uses offset guidance to globally sample highly relevant key elements throughout the video. In addition, we propose a feature aggregation module that aggregates the favorable spatial information of adjacent frame features at different scales to improve the video reconstruction quality. Comprehensive experiments show that our proposed self-guided transformer for video super-resolution outperforms state-of-the-art (SOTA) methods on several public datasets and produces good results visually.

Pages: 186-198

Page count: 13

50 related records in total

[1] Self-guided Transformer for Video Super-Resolution
Xue, Tong
Wang, Qianrui
Huang, Xinyi
Li, Dengshi
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2024, 14434 LNCS : 186 - 198
[2] Deformable transformer for endoscopic video super-resolution
Song, Xiaowei
Tang, Hui
Yang, Chunfeng
Zhou, Guangquan
Wang, Yangang
Huang, Xinjun
Hua, Jie
Coatrieux, Gouenou
He, Xiaopu
Chen, Yang
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 77
[3] A New Dataset and Transformer for Stereoscopic Video Super-Resolution
Imani, Hassan
Islam, Md Baharul
Wong, Lai-Kuan
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 705 - 714
[4] Video Coding With Key Frames Guided Super-Resolution
Zhou, Qiang
Song, Li
Zhang, Wenjun
ADVANCES IN MULTIMEDIA INFORMATION PROCESSING-PCM 2010, PT II, 2010, 6298 : 309 - 318
[5] Learning Trajectory-Aware Transformer for Video Super-Resolution
Liu, Chengxu
Yang, Huan
Fu, Jianlong
Qian, Xueming
arXiv, 2022,
[6] Learning Trajectory-Aware Transformer for Video Super-Resolution
Liu, Chengxu
Yang, Huan
Fu, Jianlong
Qian, Xueming
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 5677 - 5686
[7] Learning Trajectory-Aware Transformer for Video Super-Resolution
Liu, Chengxu
Yang, Huan
Fu, Jianlong
Qian, Xueming
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022, 2022-June : 5677 - 5686
[8] Attention-guided video super-resolution with recurrent multi-scale spatial–temporal transformer
Wei Sun
Xianguang Kong
Yanning Zhang
Complex & Intelligent Systems, 2023, 9 : 3989 - 4002
[9] Semantically Guided Efficient Attention Transformer for Face Super-Resolution Tasks
Han, Cong
Gui, Youqiang
Cheng, Peng
You, Zhisheng
INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2025, 21 (01)
[10] Fully Cross-Attention Transformer for Guided Depth Super-Resolution
Ariav, Ido
Cohen, Israel
SENSORS, 2023, 23 (05)
