Transformer-based image super-resolution and its lightweight

Cited by: 2

Authors:

Zhang, Dongxiao [1]

Qi, Tangyao [1]

Gao, Juhao [1]

Affiliations:

[1] Jimei Univ, Sch Sci, Xiamen 361021, Peoples R China

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2024 / Vol. 83 / Issue 26

Keywords:

Super-resolution; Transformer; Lightweight; Content-based early-stopping; Up and down iteration; NETWORK; RESOLUTION

DOI:

10.1007/s11042-024-18140-z

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Transformers have shown remarkable performance improvements over convolutional neural networks (CNNs) in natural language processing and high-level vision tasks. However, their application to low-level vision tasks, such as single image super-resolution (SISR), remains under-explored. In this paper, we introduce an up-down iterative algorithm and design a residual down and up Transformer block (RDUTB) within the Transformer framework. We then propose an RDUTB-based network for SISR that can effectively reconstruct low-resolution (LR) images. Furthermore, to address the growing demand for SISR models that can run on low-end mobile devices, we simplify the proposed model structure and adopt a content-based early-stopping strategy to reduce the number of parameters and accelerate reconstruction while maintaining high quality. Experimental results show that the proposed Transformer-based SISR network and its lightweight version outperform both traditional CNN-based SISR methods and some of the latest Transformer-based SISR methods.


Pages: 68625-68649

Page count: 25
