Efficient frequency feature aggregation transformer for image super-resolution

Citations: 0
Authors
Song, Jianwen [1,2]
Sowmya, Arcot [1]
Sun, Changming [1,2]
Affiliations
[1] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW 2052, Australia
[2] CSIRO Data61, Epping, NSW 1710, Australia
Keywords
Image super-resolution; Frequency aggregation; Transformer; Feature extraction; Attention; Module
DOI
10.1016/j.patcog.2025.111735
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Although vision transformers have shown remarkable performance in image super-resolution tasks, the key component, i.e., the self-attention mechanism, suffers from insufficient high-frequency information extraction capability and high computational costs, hindering further advancement. To address these limitations, we propose an efficient frequency feature aggregation transformer for single image super-resolution (EFATSR). Specifically, a frequency self-attention aggregation block is proposed to enhance the extraction of high-frequency information. This block incorporates a frequency spatial feature aggregation branch to supplement high-frequency feature extraction for a self-attention branch, enabling the model to capture high-frequency information more effectively. Additionally, a frequency channel-spatial aggregation block is proposed to extract channel and spatial features in the frequency domain, enhancing the efficiency of deep feature extraction. Extensive experiments on single image super-resolution demonstrate that EFATSR achieves state-of-the-art performance while maintaining low computational complexity. Furthermore, we extend EFATSR for stereo image super-resolution by incorporating a multi-head parallax-attention block, forming EFATSSR, which also shows remarkable performance and high efficiency. Source code is available at https://github.com/jianwensong/EFATSR.
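As a rough illustration of what supplementing a feature map with high-frequency information can mean, the sketch below removes the lowest-frequency component of a small patch in the Fourier domain and adds the remaining high-frequency residual back to the original. It is not the EFATSR block described in the abstract: the function names (`dft2`, `idft2`, `high_pass`, `aggregate`), the `cutoff` and `weight` parameters, and the simple additive fusion are all assumptions made for illustration, and the naive transform is only practical for tiny inputs.

```python
import cmath


def dft2(x):
    """Naive 2-D discrete Fourier transform of a list of lists (tiny inputs only)."""
    h, w = len(x), len(x[0])
    return [[sum(x[m][n] * cmath.exp(-2j * cmath.pi * (u * m / h + v * n / w))
                 for m in range(h) for n in range(w))
             for v in range(w)] for u in range(h)]


def idft2(X):
    """Naive 2-D inverse discrete Fourier transform."""
    h, w = len(X), len(X[0])
    return [[sum(X[u][v] * cmath.exp(2j * cmath.pi * (u * m / h + v * n / w))
                 for u in range(h) for v in range(w)) / (h * w)
             for n in range(w)] for m in range(h)]


def high_pass(x, cutoff=1):
    """Zero every frequency bin whose wrapped index is below `cutoff` on both axes.

    With the default cutoff of 1 only the DC bin is removed, so the result
    is the patch minus its mean. (Hypothetical helper, not from the paper.)
    """
    h, w = len(x), len(x[0])
    X = dft2(x)
    for u in range(h):
        for v in range(w):
            # Wrapped index distance from DC along each axis.
            if min(u, h - u) < cutoff and min(v, w - v) < cutoff:
                X[u][v] = 0
    return [[value.real for value in row] for row in idft2(X)]


def aggregate(features, cutoff=1, weight=1.0):
    """Add the weighted high-frequency residual back to the input features.

    Assumption: plain addition stands in for the two-branch aggregation
    that the abstract describes only at a high level.
    """
    residual = high_pass(features, cutoff)
    return [[f + weight * r for f, r in zip(f_row, r_row)]
            for f_row, r_row in zip(features, residual)]
```

On a flat patch the residual is zero, so `aggregate` leaves the patch unchanged; on a checkerboard patch the residual carries all the alternating detail, so the output has stronger contrast than the input.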
Pages: 12