Restormer: Efficient Transformer for High-Resolution Image Restoration

Cited by: 2035

Authors:

Zamir, Syed Waqas [1]

Arora, Aditya [1]

Khan, Salman [2]

Hayat, Munawar [2,3]

Khan, Fahad Shahbaz [2,4]

Yang, Ming-Hsuan [5,6,7]

Affiliations:

[1] Inception Institute of AI, Abu Dhabi, United Arab Emirates

[2] Mohamed bin Zayed University of AI, Abu Dhabi, United Arab Emirates

[3] Monash University, Clayton, VIC, Australia

[4] Linköping University, Linköping, Sweden

[5] University of California, Merced, CA, USA

[6] Yonsei University, Seoul, South Korea

[7] Google Research, Mountain View, CA, USA

Source:

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022) | 2022

Funding:

Australian Research Council

Keywords:

DOI:

10.1109/CVPR52688.2022.00564

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Since convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data, these models have been extensively applied to image restoration and related tasks. Recently, another class of neural architectures, Transformers, has shown significant performance gains on natural language and high-level vision tasks. While the Transformer model mitigates the shortcomings of CNNs (i.e., limited receptive field and inadaptability to input content), its computational complexity grows quadratically with the spatial resolution, making it infeasible to apply to most image restoration tasks involving high-resolution images. In this work, we propose an efficient Transformer model by making several key designs in the building blocks (multi-head attention and feed-forward network) such that it can capture long-range pixel interactions while remaining applicable to large images. Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks, including image deraining, single-image motion deblurring, defocus deblurring (single-image and dual-pixel data), and image denoising (Gaussian grayscale/color denoising, and real image denoising). The source code and pre-trained models are available at https://github.com/swz30/Restormer.
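The quadratic growth mentioned in the abstract can be made concrete with a small sketch. It is not taken from the Restormer code base: it assumes plain self-attention in which every pixel is one token and attends to every other pixel, a single head, and 4-byte (float32) entries. The function names `attention_map_entries` and `attention_map_gib` are hypothetical and exist only for this illustration.

```python
def attention_map_entries(height: int, width: int) -> int:
    """Entries in one attention map when every pixel attends to every pixel.

    Assumption: one token per pixel, so the map is (H*W) x (H*W).
    """
    tokens = height * width
    return tokens * tokens


def attention_map_gib(height: int, width: int, bytes_per_entry: int = 4) -> float:
    """Memory for a single attention map in GiB (float32 entries by default)."""
    return attention_map_entries(height, width) * bytes_per_entry / 2**30


if __name__ == "__main__":
    # Doubling the side length quadruples the token count and
    # multiplies the attention-map size by 16.
    for side in (64, 128, 256, 512):
        entries = attention_map_entries(side, side)
        gib = attention_map_gib(side, side)
        print(f"{side}x{side}: {entries:,} entries, {gib:.4f} GiB")
```

Under these assumptions a single 256x256 input already needs 2^32 attention entries, or 16 GiB at 4 bytes each, which is the kind of cost that motivates the efficient design the abstract describes.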


Pages: 5718-5729

Page count: 12
