Spatial and frequency information fusion transformer for image super-resolution

Citations: 0
Authors
Zhang, Yan [1]
Xu, Fujie [1]
Sun, Yemei [1]
Wang, Jiao [1]
Affiliations
[1] Tianjin Chengjian Univ, Coll Comp & Informat Engn, Tianjin 300384, Peoples R China
Keywords
Super resolution; Vision transformer; Frequency components; Convolutional neural network
DOI
10.1016/j.neunet.2025.107351
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Previous work has shown that Transformer-based models deliver impressive reconstruction performance in single image super-resolution (SISR). However, existing Transformer-based approaches compute self-attention within non-overlapping windows. This restriction prevents the network from attaining a large receptive field, which is essential for capturing global information and establishing long-range dependencies, especially in the early layers. To fully leverage global information and activate more pixels during image reconstruction, we develop a Spatial and Frequency Information Fusion Transformer (SFFT) with an expansive receptive field. SFFT combines spatial and frequency domain information to exploit their complementary strengths, capturing both local and global image features while integrating low- and high-frequency information. Additionally, we use the overlapping cross-attention block (OCAB) to pass pixel information between adjacent windows, enhancing network performance. During training, we incorporate a Fast Fourier Transform (FFT) loss to make full use of the proposed modules and further tap into the model's potential. Extensive quantitative and qualitative evaluations on benchmark datasets indicate that the proposed algorithm surpasses state-of-the-art methods in accuracy. Specifically, our method achieves a PSNR of 32.67 dB on the Manga109 dataset, surpassing SwinIR by 0.64 dB and HAT by 0.19 dB. The source code and pre-trained models are available at https://github.com/Xufujie/SFFT
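The abstract does not state the exact form of the FFT loss. As a minimal sketch, assuming the L1 frequency-domain formulation that is common in the super-resolution literature (not necessarily the authors' definition), the loss can be read as the mean absolute difference between the 2-D spectra of the reconstructed and ground-truth images. The function name `fft_l1_loss` is hypothetical, and NumPy is used here only for illustration in place of a training framework.

```python
import numpy as np


def fft_l1_loss(sr, hr):
    """Mean absolute difference between the 2-D spectra of two images.

    Assumption: an L1 penalty over the stacked real and imaginary parts;
    the paper's actual FFT loss may differ.
    """
    sr = np.asarray(sr, dtype=np.float64)
    hr = np.asarray(hr, dtype=np.float64)
    if sr.shape != hr.shape:
        raise ValueError("sr and hr must have the same shape")

    # 2-D FFT over the last two axes (height, width); any leading
    # axes (batch, channel) are transformed independently.
    diff = np.fft.fft2(sr, axes=(-2, -1)) - np.fft.fft2(hr, axes=(-2, -1))

    # Stack real and imaginary parts, then average their magnitudes.
    parts = np.stack([diff.real, diff.imag])
    return float(np.mean(np.abs(parts)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hr = rng.random((3, 8, 8))                        # stand-in ground truth
    sr = hr + 0.01 * rng.standard_normal(hr.shape)    # stand-in reconstruction
    print(fft_l1_loss(sr, hr))
```

In training, a term like this would typically be added to a pixel-space loss with a weighting factor; the abstract gives neither the pixel loss nor the weight, so both are left out of the sketch.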
Pages: 12
Related papers
50 in total
  • [31] Group Shuffle and Spectral-Spatial Fusion for Hyperspectral Image Super-Resolution
    Wang, Xinya
    Cheng, Yingsong
    Mei, Xiaoguang
    Jiang, Junjun
    Ma, Jiayi
    IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING, 2022, 8 : 1223 - 1236
  • [32] Activating More Pixels in Image Super-Resolution Transformer
    Chen, Xiangyu
    Wang, Xintao
    Zhou, Jiantao
    Qiao, Yu
    Dong, Chao
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 22367 - 22377
  • [33] Steformer: Efficient Stereo Image Super-Resolution With Transformer
    Lin, Jianxin
    Yin, Lianying
    Wang, Yijun
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8396 - 8407
  • [34] Efficient mixed transformer for single image super-resolution
    Zheng, Ling
    Zhu, Jinchen
    Shi, Jinpeng
    Weng, Shizhuang
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [35] Image Super-Resolution Using Dilated Window Transformer
    Park, Soobin
    Choi, Yong Suk
    IEEE ACCESS, 2023, 11 : 60028 - 60039
  • [36] ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution
    Zhang, Mingjin
    Zhang, Chi
    Zhang, Qiming
    Guo, Jie
    Gao, Xinbo
    Zhang, Jing
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 23016 - 23027
  • [37] Multi-granularity Transformer for Image Super-Resolution
    Zhuge, Yunzhi
    Jia, Xu
    COMPUTER VISION - ACCV 2022, PT III, 2023, 13843 : 138 - 154
  • [38] Learning Texture Transformer Network for Image Super-Resolution
    Yang, Fuzhi
    Yang, Huan
    Fu, Jianlong
    Lu, Hongtao
    Guo, Baining
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 5790 - 5799
  • [39] Efficient Dual Attention Transformer for Image Super-Resolution
    Park, Soobin
    Jeong, Yuna
    Choi, Yong Suk
    39TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2024, 2024, : 963 - 970
  • [40] Enhancing Image Super-Resolution with Dual Compression Transformer
    Yu, Jiaxing
    Chen, Zheng
    Wang, Jingkai
    Kong, Linghe
    Yan, Jiajie
    Gu, Wei
    VISUAL COMPUTER, 2024, : 4879 - 4892