Multi-granularity Transformer for Image Super-Resolution

Cited by: 0

Authors:

Zhuge, Yunzhi [1]

Jia, Xu [2]

Affiliations:

[1] Univ Adelaide, Adelaide, SA, Australia

[2] Dalian Univ Technol, Sch Artificial Intelligence, Dalian, Peoples R China

Source:

COMPUTER VISION - ACCV 2022, PT III | 2023 / Vol. 13843

Keywords:

SPARSE;

DOI:

10.1007/978-3-031-26313-2_9

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recently, transformers have achieved great success in computer vision. Thus far, most of this work has focused on high-level tasks, e.g., image classification and object detection, and fewer attempts have been made to solve low-level problems. In this work, we tackle image super-resolution. Specifically, transformer architectures with multi-granularity transformer groups are explored for complementary information interaction to improve the accuracy of super-resolution. We exploit three transformer patterns, i.e., window transformers, dilated transformers and global transformers. We further investigate their combination and propose a Multi-granularity Transformer (MugFormer). Specifically, the window transformer layer is combined with other transformer layers to compose three transformer groups, namely, the Local Transformer Group, Dilated Transformer Group and Global Transformer Group, which efficiently aggregate both local and global information for accurate reconstruction. Extensive experiments on five benchmark datasets demonstrate that MugFormer performs favorably against state-of-the-art methods in terms of both quantitative and qualitative results.
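The abstract names three attention patterns (window, dilated and global) without defining them. The sketch below is illustrative only: it shows one common way such granularities can be realised, by choosing which tokens of a feature grid attend to each other. The function names, the strided sampling used for the dilated pattern, and the identity Q/K/V projections are all assumptions made for brevity, not MugFormer's actual implementation.

```python
import numpy as np

def window_groups(h, w, ws):
    """Token indices of an h x w grid grouped into non-overlapping ws x ws windows."""
    idx = np.arange(h * w).reshape(h, w)
    g = idx.reshape(h // ws, ws, w // ws, ws).transpose(0, 2, 1, 3)
    return g.reshape(-1, ws * ws)

def dilated_groups(h, w, ws):
    """Groups of ws x ws tokens sampled with a stride, so each group spans the grid.
    Assumed scheme: token (i, j) joins group (i % (h // ws), j % (w // ws))."""
    idx = np.arange(h * w).reshape(h, w)
    dh, dw = h // ws, w // ws  # dilation rates along height and width
    g = idx.reshape(ws, dh, ws, dw).transpose(1, 3, 0, 2)
    return g.reshape(-1, ws * ws)

def global_groups(h, w):
    """A single group containing every token of the grid."""
    return np.arange(h * w).reshape(1, -1)

def grouped_attention(x, groups):
    """Softmax self-attention applied independently inside each group.
    x has shape (num_tokens, channels); Q, K and V are x itself (an assumption)."""
    out = np.empty_like(x)
    scale = 1.0 / np.sqrt(x.shape[1])
    for g in groups:
        t = x[g]                                     # tokens of this group
        scores = t @ t.T * scale
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=1, keepdims=True)
        out[g] = weights @ t
    return out

if __name__ == "__main__":
    x = np.random.default_rng(0).normal(size=(16, 8))  # 4 x 4 grid, 8 channels
    for name, groups in [("window", window_groups(4, 4, 2)),
                         ("dilated", dilated_groups(4, 4, 2)),
                         ("global", global_groups(4, 4))]:
        print(name, groups.tolist(), grouped_attention(x, groups).shape)
```

On a 4 x 4 grid with a group size of 2, the window pattern groups neighbouring tokens such as 0, 1, 4, 5, the dilated pattern groups spread-out tokens such as 0, 2, 8, 10, and the global pattern places all 16 tokens in one group. This shows how the three patterns trade locality for receptive field.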


Pages: 138-154

Page count: 17
