MadFormer: multi-attention-driven image super-resolution method based on Transformer

Cited by: 4

Authors:

Liu, Beibei [1]

Sun, Jing [1]

Zhu, Bing [2]

Li, Ting [1]

Sun, Fuming [1]

Affiliations:

[1] Dalian Minzu Univ, Sch Informat & Commun Engn, Liaohe West Rd, Dalian 116600, Liaoning, Peoples R China

[2] Harbin Inst Technol, Sch Elect & Informat Engn, Xidazhi St, Harbin 150006, Heilongjiang, Peoples R China

Source:

MULTIMEDIA SYSTEMS | 2024 / Vol. 30 / Issue 02

Funding:

National Natural Science Foundation of China;

Keywords:

Image super-resolution; Transformer; Multi-attention-driven; Dynamic fusion;

DOI:

10.1007/s00530-024-01276-1

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

Although Transformer-based methods have demonstrated exceptional performance in low-level vision tasks, their strong modeling ability is limited to local regions, so they neglect the spatial feature information and the high-frequency details within channels that matter for super-resolution. To enhance feature information and improve visual quality, we propose MadFormer, a multi-attention-driven image super-resolution method based on a Transformer network. First, the low-resolution image passes through an initial convolution to extract shallow features and is fed into a residual multi-attention block that incorporates channel attention, spatial attention, and self-attention. Multi-head self-attention captures global-local feature information, while channel attention and spatial attention capture high-frequency features in the channel and spatial domains. Next, the deep features are passed to a dynamic fusion block that dynamically fuses the features extracted by the multiple attention mechanisms, facilitating the aggregation of cross-window information. Finally, the shallow and deep features are fused via convolution operations, and a high-quality reconstruction stage produces the high-resolution image. Comprehensive quantitative and qualitative comparisons with other advanced algorithms show that the proposed approach has substantial advantages in peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) for image super-resolution.
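The abstract reports results in PSNR without spelling out the metric. As a minimal sketch, the standard definition is PSNR = 10 · log10(MAX² / MSE), where MSE is the mean squared error between the reference and the reconstructed image. The function name `psnr`, the list-of-rows image layout, and the 8-bit peak value of 255 are assumptions made for illustration; the paper's actual evaluation protocol (for example colour-channel handling or border cropping) is not given in this record.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equally sized grayscale images.

    Images are lists of rows of pixel intensities (an illustrative layout,
    not taken from the paper). max_value is the peak intensity, assumed 255.
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of rows")

    squared_error = 0.0
    pixel_count = 0
    for ref_row, est_row in zip(reference, estimate):
        if len(ref_row) != len(est_row):
            raise ValueError("rows must have the same length")
        for ref_pixel, est_pixel in zip(ref_row, est_row):
            squared_error += (ref_pixel - est_pixel) ** 2
            pixel_count += 1

    if pixel_count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / pixel_count
    if mse == 0:
        # Identical images: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 example: every reconstructed pixel is off by one level.
ground_truth = [[10, 20], [30, 40]]
reconstruction = [[11, 21], [31, 41]]
score_db = psnr(ground_truth, reconstruction)
```

A higher value means the reconstruction is closer to the reference. SSIM, the other metric named in the abstract, compares local luminance, contrast, and structure and is not covered by this sketch.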


Pages: 11
