Self-guided Transformer for Video Super-Resolution

Authors
Xue, Tong [1]
Wang, Qianrui [1]
Huang, Xinyi [1]
Li, Dengshi [1]
Affiliations
[1] Jianghan Univ, Sch Artificial Intelligence, Wuhan 430056, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT X | 2024 / Vol. 14434
Keywords
Video super-resolution; Self-guided transformer; Multi-headed self-attention based on offset-guided window; Feature aggregation
DOI
10.1007/978-981-99-8549-4_16
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The challenge of video super-resolution (VSR) is to leverage the long-range spatio-temporal correlation between low-resolution (LR) frames to generate high-resolution (HR) video frames. However, CNN-based VSR approaches are limited in modeling long-range dependencies and non-local self-similarity. In this paper, for further spatio-temporal learning, we propose a novel self-guided transformer for video super-resolution (SGTVSR). In this framework, we customize a multi-headed self-attention based on offset-guided window (OGW-MSA). For each query element on a low-resolution reference frame, the OGW-MSA uses offset guidance to globally sample highly relevant key elements throughout the video. In addition, we propose a feature aggregation module that aggregates the favorable spatial information of adjacent frame features at different scales to improve the video reconstruction quality. Comprehensive experiments show that SGTVSR outperforms state-of-the-art (SOTA) methods on several public datasets and produces visually good results.
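The abstract only outlines OGW-MSA, so the following is a rough, single-head illustration of the general idea of offset-guided window attention, not the authors' implementation. It assumes that each frame supplies one integer offset `(dy, dx)` that shifts a small window around the query position, that keys double as values (no learned projections), and that every frame is at least `window` pixels in each dimension. The function name `ogw_attention` and all parameters are hypothetical.

```python
import math

def ogw_attention(query, frames, offsets, pos, window=3):
    """Toy offset-guided window attention for one query element.

    query   : list of C floats (feature of the query on the reference frame)
    frames  : list of T frames, each an H x W grid of C-dim feature lists
    offsets : list of T integer (dy, dx) pairs, assumed to be given
    pos     : (y, x) position of the query on the reference frame
    """
    r = window // 2
    keys = []
    for frame, (dy, dx) in zip(frames, offsets):
        h, w = len(frame), len(frame[0])
        # Shift the window centre by the offset, clamped so the window fits.
        cy = min(max(pos[0] + dy, r), h - 1 - r)
        cx = min(max(pos[1] + dx, r), w - 1 - r)
        for y in range(cy - r, cy + r + 1):
            for x in range(cx - r, cx + r + 1):
                keys.append(frame[y][x])

    # Scaled dot-product scores followed by a numerically stable softmax.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Keys are reused as values in this simplified sketch.
    out = [sum(wt * key[c] for wt, key in zip(weights, keys))
           for c in range(len(query))]
    return out, weights


# Two 5x5 frames with 2-channel features; values are arbitrary test data.
frames = [[[[float(t + y), float(x)] for x in range(5)] for y in range(5)]
          for t in range(2)]
out, weights = ogw_attention([1.0, 0.5], frames, [(0, 0), (1, -1)], (2, 2))
```

In this toy setup the query attends to `window * window` keys per frame, so the attention cost grows with the number of frames rather than with the full frame area. How SGTVSR actually predicts and applies its offsets is not stated in the abstract.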
Pages: 186-198
Number of pages: 13
Related papers
  • [1] Self-guided Transformer for Video Super-Resolution
    Xue, Tong
    Wang, Qianrui
    Huang, Xinyi
    Li, Dengshi
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2024, 14434 LNCS : 186 - 198
  • [2] Deformable transformer for endoscopic video super-resolution
    Song, Xiaowei
    Tang, Hui
    Yang, Chunfeng
    Zhou, Guangquan
    Wang, Yangang
    Huang, Xinjun
    Hua, Jie
    Coatrieux, Gouenou
    He, Xiaopu
    Chen, Yang
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 77
  • [3] A New Dataset and Transformer for Stereoscopic Video Super-Resolution
    Imani, Hassan
    Islam, Md Baharul
    Wong, Lai-Kuan
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022 : 705 - 714
  • [4] Video Coding With Key Frames Guided Super-Resolution
    Zhou, Qiang
    Song, Li
    Zhang, Wenjun
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING-PCM 2010, PT II, 2010, 6298 : 309 - 318
  • [5] Learning Trajectory-Aware Transformer for Video Super-Resolution
    Liu, Chengxu
    Yang, Huan
    Fu, Jianlong
    Qian, Xueming
    arXiv, 2022
  • [6] Learning Trajectory-Aware Transformer for Video Super-Resolution
    Liu, Chengxu
    Yang, Huan
    Fu, Jianlong
    Qian, Xueming
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022 : 5677 - 5686
  • [7] Learning Trajectory-Aware Transformer for Video Super-Resolution
    Liu, Chengxu
    Yang, Huan
    Fu, Jianlong
    Qian, Xueming
    Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022, 2022-June : 5677 - 5686
  • [8] Attention-guided video super-resolution with recurrent multi-scale spatial–temporal transformer
    Wei Sun
    Xianguang Kong
    Yanning Zhang
    Complex & Intelligent Systems, 2023, 9 : 3989 - 4002
  • [9] Semantically Guided Efficient Attention Transformer for Face Super-Resolution Tasks
    Han, Cong
    Gui, Youqiang
    Cheng, Peng
    You, Zhisheng
    INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2025, 21 (01)
  • [10] Fully Cross-Attention Transformer for Guided Depth Super-Resolution
    Ariav, Ido
    Cohen, Israel
    SENSORS, 2023, 23 (05)