Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution

Cited by: 28

Authors:

Chen, Hao-Wei [1,2]

Xu, Yu-Syuan [2]

Hong, Min-Fong [2]

Tsai, Yi-Min [2]

Kuo, Hsien-Kai [2]

Lee, Chun-Yi [1]

Affiliations:

[1] National Tsing Hua University, Elsa Lab, Hsinchu, Taiwan

[2] MediaTek Inc, Hsinchu, Taiwan

Source:

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2023

DOI:

10.1109/CVPR52729.2023.01751

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Implicit neural representation has recently shown a promising ability to represent images at arbitrary resolutions. In this paper, we present a Local Implicit Transformer (LIT), which integrates the attention mechanism and a frequency encoding technique into a local implicit image function. We design a cross-scale local attention block to effectively aggregate local features and a local frequency encoding block to combine positional encoding with Fourier domain information for constructing high-resolution images. To further improve representational power, we propose a Cascaded LIT (CLIT) that exploits multi-scale features, along with a cumulative training strategy that gradually increases the upsampling scales during training. We have conducted extensive experiments to validate the effectiveness of these components and to analyze various training strategies. The qualitative and quantitative results demonstrate that LIT and CLIT achieve favorable results and outperform prior works on arbitrary-scale super-resolution tasks.

Pages: 18257-18267

Page count: 11
