SRDiff: Single image super-resolution with diffusion probabilistic models

Cited by: 411

Authors:

Li, Haoying [1]

Yang, Yifan [1]

Chang, Meng [1]

Chen, Shiqi [1]

Feng, Huajun [1]

Xu, Zhihai [1]

Li, Qi [1]

Chen, Yueting [1]

Affiliations:

[1] Zhejiang University, State Key Laboratory of Modern Optical Instrumentation, Hangzhou, Zhejiang, People's Republic of China

Source:

NEUROCOMPUTING | 2022 / Vol. 479

Funding:

National Natural Science Foundation of China;

Keywords:

Single image super-resolution; Diffusion probabilistic model; Diverse results; Deep learning; RESOLUTION; NETWORK;

DOI:

10.1016/j.neucom.2022.01.029

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Single image super-resolution (SISR) aims to reconstruct high-resolution (HR) images from given low-resolution (LR) images. It is an ill-posed problem because one LR image corresponds to multiple HR images. Recently, learning-based SISR methods have greatly outperformed traditional methods. However, PSNR-oriented, GAN-driven and flow-based methods suffer from over-smoothing, mode collapse and large model footprint issues, respectively. To solve these problems, we propose a novel SISR diffusion probabilistic model (SRDiff), which is the first diffusion-based model for SISR. SRDiff is optimized with a variant of the variational bound on the data likelihood. Through a Markov chain, it can provide diverse and realistic super-resolution (SR) predictions by gradually transforming Gaussian noise into a super-resolution image conditioned on an LR input. In addition, we introduce residual prediction to the whole framework to speed up model convergence. Our extensive experiments on facial and general benchmarks (CelebA and DIV2K datasets) show that (1) SRDiff can generate diverse SR results with rich details and achieve competitive performance against other state-of-the-art methods when given only one LR input; (2) SRDiff is easy to train with a small footprint (in this paper, "footprint" means model size, i.e. the number of model parameters); (3) SRDiff can perform flexible image manipulation operations, including latent space interpolation and content fusion. © 2022 Elsevier B.V. All rights reserved.
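The sampling procedure the abstract describes (Gaussian noise transformed step by step into an SR image, conditioned on the LR input, with residual prediction) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes the standard DDPM linear noise schedule and ancestral sampling rule, reads "residual prediction" as modelling the difference between the HR image and an upsampled LR image, and works on 1-D lists instead of images. The names `linear_beta_schedule`, `upsample_nearest`, `dummy_noise_predictor` and `sample_sr` are hypothetical, and the predictor is a placeholder for a trained conditional network.

```python
import math
import random


def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T (standard DDPM linear schedule, assumed)."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + step * t for t in range(T)]


def upsample_nearest(lr, scale):
    """Hypothetical stand-in for LR upsampling: 1-D nearest neighbour."""
    return [v for v in lr for _ in range(scale)]


def dummy_noise_predictor(x_t, lr_up, t):
    """Placeholder for the trained conditional network; predicts zero noise."""
    return [0.0] * len(x_t)


def sample_sr(lr, scale=2, T=50, predictor=dummy_noise_predictor, rng=None):
    """Run the reverse Markov chain on the residual, then add the upsampled LR."""
    if rng is None:
        rng = random.Random(0)
    betas = linear_beta_schedule(T)
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a
        alpha_bars.append(prod)

    lr_up = upsample_nearest(lr, scale)
    x = [rng.gauss(0.0, 1.0) for _ in lr_up]  # x_T ~ N(0, I)
    for t in reversed(range(T)):
        eps = predictor(x, lr_up, t)  # noise estimate, conditioned on the LR input
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # one common variance choice in DDPM
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added at the final step
    # Residual prediction (assumed form): SR = predicted residual + upsampled LR.
    return [r + u for r, u in zip(x, lr_up)]
```

Calling `sample_sr` on the same LR input with different `rng` seeds returns different outputs, which is the mechanism behind the "diverse results" claim: one LR input, many plausible SR samples. With the placeholder predictor the outputs are meaningless as images; only the control flow is illustrative.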

Pages: 47-59

Page count: 13
