HAAT: Hybrid Attention Aggregation Transformer for Image Super-Resolution

Cited by: 0

Authors:

Lai, Song-Jiang [1,2]

Cheung, Tsun-Hin [1,2]

Fung, Ka-Chun [1,2]

Xue, Kai-Wen [1,2]

Lam, Kin-Man [1,2]

Affiliations:

[1] Ctr Adv Reliabil & Safety, New Territories, Hong Kong, Peoples R China

[2] Hong Kong Polytechn Univ, Dept Elect & Elect Engn, Kowloon, Hong Kong, Peoples R China

Source:

INTERNATIONAL WORKSHOP ON ADVANCED IMAGING TECHNOLOGY, IWAIT 2025 | 2025 / Vol. 13510

Keywords:

Image super-resolution; Computer vision; Attention mechanism; Transformer;

DOI:

10.1117/12.3058003

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In the research area of image super-resolution, Swin-transformer-based models are favored for their global spatial modeling and shifting window attention mechanism. However, existing methods often limit self-attention to non-overlapping windows to cut costs and ignore the useful information that exists across channels. To address this issue, this paper introduces a novel model, the Hybrid Attention Aggregation Transformer (HAAT), designed to better leverage feature information. HAAT is constructed by integrating Swin-Dense-Residual-Connected Blocks (SDRCB) with Hybrid Grid Attention Blocks (HGAB). SDRCB expands the receptive field while maintaining a streamlined architecture, resulting in enhanced performance. HGAB incorporates channel attention, sparse attention, and window attention to improve nonlocal feature fusion and achieve more visually compelling results. Experimental evaluations demonstrate that HAAT surpasses state-of-the-art methods on benchmark datasets.
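The contrast the abstract draws between attention restricted to non-overlapping windows and sparse attention can be illustrated with a small sketch. This is not the HAAT implementation: the function names (`window_groups`, `grid_groups`, `span`) are hypothetical, and it assumes that sparse attention groups pixels on a strided grid, which the abstract does not specify. It only shows which pixel positions would attend to one another under each grouping.

```python
def window_groups(height, width, window):
    """Group pixel coordinates into non-overlapping window x window tiles."""
    groups = {}
    for r in range(height):
        for c in range(width):
            groups.setdefault((r // window, c // window), []).append((r, c))
    return list(groups.values())


def grid_groups(height, width, stride):
    """Group pixel coordinates that share the same offset modulo stride.

    Assumption: this strided grouping stands in for "sparse attention";
    the abstract does not describe the actual scheme.
    """
    groups = {}
    for r in range(height):
        for c in range(width):
            groups.setdefault((r % stride, c % stride), []).append((r, c))
    return list(groups.values())


def span(group):
    """Largest row or column distance covered by the pixels of one group."""
    rows = [r for r, _ in group]
    cols = [c for _, c in group]
    return max(max(rows) - min(rows), max(cols) - min(cols))


if __name__ == "__main__":
    local = window_groups(8, 8, 4)   # four 4x4 windows
    sparse = grid_groups(8, 8, 2)    # four strided groups
    print(len(local[0]), span(local[0]))    # 16 3
    print(len(sparse[0]), span(sparse[0]))  # 16 6
```

On an 8x8 feature map both groupings yield groups of 16 positions, so attention within a group costs about the same, but the strided groups reach across almost the whole map while each window stays local.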

Pages: 6
