Cross-scale hierarchical spatio-temporal transformer for video enhancement

Cited by: 0
Authors
Jiang, Qin [1, 2, 3]
Wang, Qinglin [1, 2, 3]
Chi, Lihua [4]
Liu, Jie [1, 2, 3]
Affiliations
[1] Natl Univ Def Technol, Changsha, Peoples R China
[2] Lab Digitizing Software Frontier Equipment, Changsha, Peoples R China
[3] Sci & Technol Parallel & Distributed Proc Lab, Changsha, Peoples R China
[4] Hunan GuoKe Computil Technol Co Ltd, Changsha, Peoples R China
Keywords
Video super-resolution; Denoising; Deblurring; Transformer; Temporal
DOI
10.1016/j.knosys.2024.112773
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The diversity and complexity of degradations in low-quality videos pose non-trivial challenges for video enhancement, which aims to reconstruct the high-quality counterparts. Prevailing sliding-window-based methods perform poorly because of their limited window size. Recurrent networks take advantage of long-term modeling to aggregate more information, resulting in significant performance improvements. However, most of them are trained on data with simple degradations and can only tackle a specific degradation. To break through this limitation, we propose a progressive alignment network, namely the Cross-scale Hierarchical Spatio-Temporal Transformer (CHSTT), which leverages cross-scale tokenization to generate multi-scale visual tokens over the entire sequence and thereby capture extensive long-range temporal dependencies. To enhance spatial and temporal interactions, we introduce a hierarchical Transformer that facilitates the computation of mutual multi-head attention across both spatial and temporal dimensions. Quantitative and qualitative assessments substantiate the superior performance of CHSTT compared with several state-of-the-art benchmarks across three distinct video enhancement tasks: video super-resolution, video denoising, and video deblurring.
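The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the general idea behind cross-scale tokenization: patches of several sizes are taken from every frame of the sequence, pooled to a common size, and gathered into one token set over which attention can then be computed. The function names (`patchify`, `cross_scale_tokens`, `self_attention`), the patch sizes, the average pooling, and the single-head attention without learned projections are all assumptions made for illustration; none of them are taken from CHSTT itself.

```python
import numpy as np

def patchify(video, p):
    """Split a (T, H, W, C) video into non-overlapping p x p patches.

    Returns an array of shape (T, N, p, p, C) with N = (H/p) * (W/p).
    """
    t, h, w, c = video.shape
    assert h % p == 0 and w % p == 0, "frame size must be divisible by p"
    x = video.reshape(t, h // p, p, w // p, p, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)  # (T, H/p, W/p, p, p, C)
    return x.reshape(t, (h // p) * (w // p), p, p, c)

def cross_scale_tokens(video, base=4, scales=(1, 2)):
    """Hypothetical cross-scale tokenizer (illustrative assumption).

    For each scale s, patches of size base*s are extracted from every frame
    and average-pooled down to base x base, so all tokens share one length.
    Tokens from all frames and all scales are concatenated into one set.
    """
    tokens = []
    for s in scales:
        patches = patchify(video, base * s)  # (T, N_s, base*s, base*s, C)
        t, n, _, _, c = patches.shape
        # Average-pool each patch by a factor of s along both spatial axes.
        pooled = patches.reshape(t, n, base, s, base, s, c).mean(axis=(3, 5))
        tokens.append(pooled.reshape(t * n, base * base * c))
    return np.concatenate(tokens, axis=0)

def self_attention(x):
    """Single-head scaled dot-product self-attention without learned weights."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x

# Toy sequence: 3 frames of 16 x 16 RGB.
video = np.random.default_rng(0).random((3, 16, 16, 3))
tokens = cross_scale_tokens(video)  # tokens from all frames and both scales
mixed = self_attention(tokens)      # every token attends to every other token
print(tokens.shape, mixed.shape)
```

Because the token set spans every frame and every scale, a single attention pass in this sketch mixes spatial and temporal information at once; a hierarchical multi-head design such as the one the abstract describes would structure that interaction rather than flatten it.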
Pages: 13
Related papers
  • [1] MSTG: Multi-Scale Transformer with Gradient for joint spatio-temporal enhancement
    Lin, Xin
    Chen, Junli
    Ai, Shaojie
    Liu, Jing
    Li, Bochao
    Li, Qingying
    Ma, Rui
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 102
  • [2] Spatio-Temporal Transformer Network for Video Restoration
    Kim, Tae Hyun
    Sajjadi, Mehdi S. M.
    Hirsch, Michael
    Schoelkopf, Bernhard
    COMPUTER VISION - ECCV 2018, PT III, 2018, 11207 : 111 - 127
  • [3] Transformer with Spatio-Temporal Representation for Video Anomaly Detection
    Sun, Xiaohu
    Chen, Jinyi
    Shen, Xulin
    Li, Hongjun
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2022, 2022, 13813 : 213 - 222
  • [4] Spatio-Temporal Scale Selection in Video Data
    Tony Lindeberg
    Journal of Mathematical Imaging and Vision, 2018, 60 : 525 - 562
  • [5] Spatio-Temporal Scale Selection in Video Data
    Lindeberg, Tony
    JOURNAL OF MATHEMATICAL IMAGING AND VISION, 2018, 60 (04) : 525 - 562
  • [6] Parallel Spatio-Temporal Attention Transformer for Video Frame Interpolation
    Ning, Xin
    Cai, Feifan
    Li, Yuhang
    Ding, Youdong
    ELECTRONICS, 2024, 13 (10)
  • [7] Cross-Scale KNN Image Transformer for Image Restoration
    Lee, Hunsang
    Choi, Hyesong
    Sohn, Kwanghoon
    Min, Dongbo
    IEEE ACCESS, 2023, 11 : 13013 - 13027
  • [8] Multiple Hierarchical Cross-Scale Transformer for Remote Sensing Scene Classification
    Zhang, Dan
    Ma, Wenping
    Jiao, Licheng
    Liu, Xu
    Yang, Yuting
    Liu, Fang
    REMOTE SENSING, 2025, 17 (01)
  • [9] Neural Video Compression with Spatio-Temporal Cross-Covariance Transformers
    Chen, Zhenghao
    Relic, Lucas
    Azevedo, Roberto
    Zhang, Yang
    Gross, Markus
    Xu, Dong
    Zhou, Luping
    Schroers, Christopher
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 8543 - 8551
  • [10] Hierarchical Spatio-Temporal Graph Convolutional Networks and Transformer Network for Traffic Flow Forecasting
    Huo, Guangyu
    Zhang, Yong
    Wang, Boyue
    Gao, Junbin
    Hu, Yongli
    Yin, Baocai
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (04) : 3855 - 3867