Joint features-guided linear transformer and CNN for efficient image super-resolution

Cited by: 2

Authors:

Wang, Bufan [1]

Zhang, Yongjun [1]

Long, Wei [1]

Cui, Zhongwei [2]

Affiliations:

[1] Guizhou Univ, Coll Comp Sci & Technol, State Key Lab Publ Big Data, Guiyang 550025, Guizhou, Peoples R China

[2] Guizhou Educ Univ, Sch Math & Big Data, Guiyang 550018, Peoples R China

Source:

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS | 2024 / Vol. 15 / Issue 12

Keywords:

Image super-resolution; Multi-level contextual information; Linear self-attention; Lightweight network; NETWORK;

DOI:

10.1007/s13042-024-02277-2

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Integrating convolutional neural networks (CNNs) and transformers has notably improved lightweight single image super-resolution (SISR). However, existing methods lack the capability to exploit multi-level contextual information, and standard transformer self-attention has computational complexity that is quadratic in the number of tokens. To address these issues, we propose a Joint features-Guided Linear Transformer and CNN Network (JGLTN) for efficient SISR, which is constructed by cascading modules composed of CNN layers and linear transformer layers. Specifically, in the CNN layer, our approach employs an inter-scale feature integration module (IFIM) to extract critical latent information across scales. Then, in the linear transformer layer, we design a joint feature-guided linear attention (JGLA). It jointly considers adjacent and extended regional features, dynamically assigning weights to convolutional kernels for contextual feature selection. This process gathers multi-level contextual information, which is used to guide linear attention for effective information interaction. Moreover, we redesign the method of computing feature similarity within the self-attention, reducing its computational complexity to linear. Extensive experiments show that our proposal outperforms state-of-the-art models while balancing performance and computational costs.
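The abstract does not say how the similarity computation is redesigned, so the following is only a generic sketch of how self-attention can be brought from quadratic to linear cost in the number of tokens. It assumes a kernel feature map elu(x) + 1, which is a common choice in the linear-attention literature and is not taken from this paper; the function names (`feature_map`, `quadratic_attention`, `linear_attention`) and the toy sizes are hypothetical. The idea is that once similarity is written as a product of feature maps, the key-value product can be computed first, so the n x n similarity matrix is never formed.

```python
import numpy as np


def feature_map(x):
    # elu(x) + 1: a strictly positive feature map.
    # This choice is an assumption for illustration, not the paper's JGLA.
    return np.where(x > 0, x + 1.0, np.exp(x))


def quadratic_attention(q, k, v):
    # Reference form: builds the full (n, n) similarity matrix,
    # so cost grows quadratically with the number of tokens n.
    qf, kf = feature_map(q), feature_map(k)
    scores = qf @ kf.T                                   # (n, n)
    weights = scores / scores.sum(axis=1, keepdims=True)  # row-normalize
    return weights @ v                                   # (n, d_v)


def linear_attention(q, k, v):
    # Same result, reordered by associativity: (Q K^T) V == Q (K^T V).
    # Only (d, d_v) and (d,) summaries of the keys/values are stored,
    # so cost grows linearly with n.
    qf, kf = feature_map(q), feature_map(k)
    kv = kf.T @ v                 # (d, d_v) key-value summary
    z = kf.sum(axis=0)            # (d,) normalizer summary
    return (qf @ kv) / (qf @ z)[:, None]


# Toy input: n tokens with d channels each (sizes are arbitrary).
rng = np.random.default_rng(0)
n, d = 64, 8
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))

out_quadratic = quadratic_attention(q, k, v)
out_linear = linear_attention(q, k, v)
```

Both functions compute the same output up to floating-point error; the linear form simply avoids materializing the token-by-token matrix, which is what makes this family of methods attractive for high-resolution feature maps.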

Pages: 5765 - 5780

Page count: 16

50 items in total

[21] Focal Aggregation Transformer for Light Field Image Super-Resolution
Wang, Shunzhou
Lu, Yao
Xia, Wang
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VIII, 2025, 15038 : 524 - 538
[22] PCCFormer: Parallel coupled convolutional transformer for image super-resolution
Hou, Bowen
Li, Gongyan
VISUAL COMPUTER, 2024, 40 (12) : 8591 - 8602
[23] Efficient image super-resolution integration
Xu, Ke
Wang, Xin
Yang, Xin
He, Shengfeng
Zhang, Qiang
Yin, Baocai
Wei, Xiaopeng
Lau, Rynson W. H.
VISUAL COMPUTER, 2018, 34 (6-8) : 1065 - 1076
[24] Joint Back Projection and Residual Networks for Efficient Image Super-Resolution
Liu, Zhi-Song
Siu, Wan-Chi
Chan, Yui-Lam
2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1054 - 1060
[25] Efficient image super-resolution integration
Ke Xu
Xin Wang
Xin Yang
Shengfeng He
Qiang Zhang
Baocai Yin
Xiaopeng Wei
Rynson W. H. Lau
The Visual Computer, 2018, 34 : 1065 - 1076
[26] Dual path features interaction network for efficient image super-resolution
Yang, Huimin
Xiao, Jingzhong
Zhang, Ji
Tian, Yu
Zhou, Xuchuan
NEUROCOMPUTING, 2024, 601
[27] RESIDUAL COMPONENT ESTIMATING CNN FOR IMAGE SUPER-RESOLUTION
Han, Xian-Hua
Sun, YongQing
Chen, Yen-Wei
2019 IEEE FIFTH INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM 2019), 2019, : 443 - 447
[28] Information sparsity guided transformer for multi-modal medical image super-resolution
Lu, Haotian
Mei, Jie
Qiu, Yu
Li, Yumeng
Hao, Fangwei
Xu, Jing
Tang, Lin
EXPERT SYSTEMS WITH APPLICATIONS, 2024, 261
[29] A CNN-transformer hybrid network with selective fusion and dual attention for image super-resolution
Zhang, Chun
Wang, Jin
Shi, Yunhui
Yin, Baocai
Ling, Nam
MULTIMEDIA SYSTEMS, 2025, 31 (02)
[30] Image Super-Resolution Using Dilated Window Transformer
Park, Soobin
Choi, Yong Suk
IEEE ACCESS, 2023, 11 : 60028 - 60039
