SRDiff: Single image super-resolution with diffusion probabilistic models

Cited by: 291
Authors
Li, Haoying [1 ]
Yang, Yifan [1 ]
Chang, Meng [1 ]
Chen, Shiqi [1 ]
Feng, Huajun [1 ]
Xu, Zhihai [1 ]
Li, Qi [1 ]
Chen, Yueting [1 ]
Affiliations
[1] Zhejiang Univ, State Key Lab Modern Opt Instrumentat, Hangzhou, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Single image super-resolution; Diffusion probabilistic model; Diverse results; Deep learning; RESOLUTION; NETWORK;
DOI
10.1016/j.neucom.2022.01.029
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Single image super-resolution (SISR) aims to reconstruct high-resolution (HR) images from given low-resolution (LR) images. It is an ill-posed problem because one LR image corresponds to multiple HR images. Recently, learning-based SISR methods have greatly outperformed traditional methods. However, PSNR-oriented, GAN-driven and flow-based methods suffer from over-smoothing, mode collapse and large model footprint issues, respectively. To solve these problems, we propose a novel SISR diffusion probabilistic model (SRDiff), which is the first diffusion-based model for SISR. SRDiff is optimized with a variant of the variational bound on the data likelihood. Through a Markov chain, it can provide diverse and realistic super-resolution (SR) predictions by gradually transforming Gaussian noise into a super-resolution image conditioned on an LR input. In addition, we introduce residual prediction to the whole framework to speed up model convergence. Our extensive experiments on facial and general benchmarks (CelebA and DIV2K datasets) show that (1) SRDiff can generate diverse SR results with rich details and achieve competitive performance against other state-of-the-art methods when given only one LR input; (2) SRDiff is easy to train and has a small footprint (in this paper, "footprint" means model size, i.e., the number of model parameters); (3) SRDiff can perform flexible image manipulation operations, including latent space interpolation and content fusion. © 2022 Elsevier B.V. All rights reserved.
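The sampling procedure summarized in the abstract (Gaussian noise transformed step by step into an SR image, conditioned on the LR input, with residual prediction) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes standard DDPM ancestral sampling with a linear noise schedule, assumes that "residual" means the difference between the HR image and an upsampled LR image, and uses nearest-neighbor upsampling for brevity. The names `sample_sr`, `upsample_nearest`, `linear_beta_schedule` and `dummy_noise_predictor` are hypothetical, and the noise predictor is an untrained placeholder standing in for a trained conditional network.

```python
import numpy as np


def linear_beta_schedule(T, beta_start=1e-4, beta_end=2e-2):
    # Assumed linear variance schedule; the abstract does not specify one.
    return np.linspace(beta_start, beta_end, T)


def upsample_nearest(lr, scale):
    # Nearest-neighbor upsampling of a 2-D image (an assumption made for brevity).
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)


def dummy_noise_predictor(x_t, lr_up, t):
    # Hypothetical placeholder for a trained network eps_theta(x_t, LR, t).
    return np.zeros_like(x_t)


def sample_sr(lr, scale, T=50, eps_model=dummy_noise_predictor, seed=0):
    """Reverse Markov chain: Gaussian noise -> residual, then add upsampled LR."""
    rng = np.random.default_rng(seed)
    betas = linear_beta_schedule(T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    lr_up = upsample_nearest(lr, scale)
    x = rng.standard_normal(lr_up.shape)  # x_T ~ N(0, I)

    for t in range(T - 1, -1, -1):
        eps = eps_model(x, lr_up, t)
        # Posterior mean of the DDPM reverse step.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            # Fresh noise at every step except the last one.
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean

    # Residual prediction: the chain models (HR - upsampled LR), so add it back.
    return lr_up + x


lr_image = np.arange(16, dtype=float).reshape(4, 4) / 15.0
sr_a = sample_sr(lr_image, scale=2, T=10, seed=0)
sr_b = sample_sr(lr_image, scale=2, T=10, seed=1)
print(sr_a.shape, np.abs(sr_a - sr_b).max() > 0)
```

Different seeds start the chain from different noise, which is how a single LR input can yield multiple SR outputs. With the placeholder predictor the outputs are not meaningful images; only the control flow is illustrated.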
Pages: 47-59
Number of pages: 13
Related papers
55 in total
  • [1] [Anonymous], Image-to-Image Translation with Conditional Adversarial Networks
  • [2] Buhler M.C., ARXIV PREPRINT ARXIV
  • [3] Root mean square error (RMSE) or mean absolute error (MAE)? - Arguments against avoiding RMSE in the literature
    Chai, T.
    Draxler, R. R.
    [J]. GEOSCIENTIFIC MODEL DEVELOPMENT, 2014, 7 (03) : 1247 - 1250
  • [4] Chen N., 2021, ICLR
  • [5] RBPNET: An asymptotic Residual Back-Projection Network for super-resolution of very low-resolution face image
    Chen, Xiaozhen
    Wang, Xuebo
    Lu, Yao
    Li, Weiqi
    Wang, Zijian
    Huang, Zhuowei
    [J]. NEUROCOMPUTING, 2020, 376 : 119 - 127
  • [6] Generative Adversarial Network-Based Image Super-Resolution Using Perceptual Content Losses
    Cheon, Manri
    Kim, Jun-Hyuk
    Choi, Jun-Ho
    Lee, Jong-Seok
    [J]. COMPUTER VISION - ECCV 2018 WORKSHOPS, PT V, 2019, 11133 : 51 - 62
  • [7] Dinh L., ARXIV PREPRINT ARXIV
  • [8] Image Super-Resolution Using Deep Convolutional Networks
    Dong, Chao
    Loy, Chen Change
    He, Kaiming
    Tang, Xiaoou
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (02) : 295 - 307
  • [9] Dong Yang, 2017, Medical Image Computing and Computer Assisted Intervention MICCAI 2017. 20th International Conference. Proceedings: LNCS 10435, P507, DOI 10.1007/978-3-319-66179-7_58
  • [10] Image super-resolution based on residually dense distilled attention network
    Dun, Yujie
    Da, Zongyang
    Yang, Shuai
    Qian, Xueming
    [J]. NEUROCOMPUTING, 2021, 443 : 47 - 57