Efficient frequency feature aggregation transformer for image super-resolution

Cited by: 0

Authors:

Song, Jianwen ^{[1,2]}

Sowmya, Arcot ^{[1]}

Sun, Changming ^{[1,2]}

Affiliations:

[1] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW 2052, Australia

[2] CSIRO Data61, Epping, NSW 1710, Australia

Source:

PATTERN RECOGNITION | 2025 / Vol. 167

Keywords:

Image super-resolution; Frequency aggregation; Transformer; Feature extraction; ATTENTION; MODULE;

DOI:

10.1016/j.patcog.2025.111735

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Although vision transformers have shown remarkable performance in image super-resolution tasks, their key component, the self-attention mechanism, suffers from insufficient high-frequency information extraction capability and high computational cost, which hinders further advancement. To address these limitations, we propose an efficient frequency feature aggregation transformer for single image super-resolution (EFATSR). Specifically, a frequency self-attention aggregation block is proposed to enhance the extraction of high-frequency information. This block incorporates a frequency spatial feature aggregation branch that supplements the high-frequency feature extraction of a self-attention branch, enabling the model to capture high-frequency information more effectively. Additionally, a frequency channel-spatial aggregation block is proposed to extract channel and spatial features in the frequency domain, enhancing the efficiency of deep feature extraction. Extensive experiments on single image super-resolution demonstrate that EFATSR achieves state-of-the-art performance while maintaining low computational complexity. Furthermore, we extend EFATSR to stereo image super-resolution by incorporating a multi-head parallax-attention block, forming EFATSSR, which also shows remarkable performance and high efficiency. Source code is available at https://github.com/jianwensong/EFATSR.
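The abstract's distinction between low- and high-frequency image content can be illustrated with a small, generic sketch. This is not the EFATSR block and is not taken from the authors' repository: it only splits one row of pixel values into a smooth part and a fine-detail part using a naive discrete Fourier transform. The function names (`dft`, `idft`, `split_frequencies`) and the `cutoff` parameter are hypothetical, chosen for illustration.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(spectrum)) / n
        for t in range(n)
    ]


def split_frequencies(row, cutoff):
    """Split one row of pixel values into low- and high-frequency parts.

    `cutoff` (hypothetical parameter) is the highest frequency index
    that is still assigned to the low band.
    """
    n = len(row)
    spectrum = dft(row)
    # Bins k and n - k hold the same physical frequency, so mask them together.
    is_low = [min(k, n - k) <= cutoff for k in range(n)]
    low = idft([c if keep else 0 for c, keep in zip(spectrum, is_low)])
    high = idft([0 if keep else c for c, keep in zip(spectrum, is_low)])
    # The mask is symmetric, so for real input the imaginary parts are rounding noise.
    return [v.real for v in low], [v.real for v in high]


# A flat row (low frequency only) plus a fine alternating texture (highest frequency).
smooth = [10.0] * 8
texture = [1.0, -1.0] * 4
row = [s + t for s, t in zip(smooth, texture)]
low, high = split_frequencies(row, cutoff=1)
```

Here `low` recovers the flat component and `high` recovers the alternating texture, which is the kind of fine detail that super-resolution models need to reconstruct. A learned frequency-domain block would replace the fixed mask with trainable operations; that part is outside this sketch.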


Pages: 12

