N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution

Cited by: 118
Authors
Choi, Haram [1]
Lee, Jeongmin [2]
Yang, Jihoon [1]
Affiliations
[1] Sogang Univ, Dept Comp Sci & Engn, Seoul, South Korea
[2] LG Innotek, Seoul, South Korea
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
DOI
10.1109/CVPR52729.2023.00206
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
While some studies have shown that Swin Transformer (Swin) with window self-attention (WSA) is suitable for single image super-resolution (SR), plain WSA ignores broad regions when reconstructing high-resolution images because of its limited receptive field. In addition, many deep learning SR methods suffer from intensive computation. To address these problems, we introduce the N-Gram context to low-level vision with Transformers for the first time. We define an N-Gram as neighboring local windows in Swin, which differs from text analysis, where an N-Gram is a sequence of consecutive characters or words. N-Grams interact with each other through sliding-WSA, expanding the regions seen when restoring degraded pixels. Using the N-Gram context, we propose NGswin, an efficient SR network with an SCDP bottleneck that takes the multi-scale outputs of the hierarchical encoder. Experimental results show that NGswin achieves competitive performance while maintaining an efficient structure compared with previous leading methods. Moreover, we also improve other Swin-based SR methods with the N-Gram context, thereby building an enhanced model: SwinIR-NG. Our improved SwinIR-NG outperforms the current best lightweight SR approaches and establishes state-of-the-art results. Code is available at https://github.com/rami0205/NGramSwin.
Pages: 2071-2081
Page count: 11