SR-BigGAN: lightweight image super-resolution with priors

Cited by: 0
Authors
Kumar, Deepak [1 ]
Rao, Harshitha Srinivas [2 ]
Kumar, Chetan [3 ]
Shao, Ming [4 ]
Affiliations
[1] Worcester Polytech Inst, Dept Comp Sci, 100 Inst Rd, Worcester, MA 01609 USA
[2] Univ Massachusetts, Program Data Sci, 285 Old Westport Rd, Dartmouth, MA 02747 USA
[3] Old Dominion Univ, Thomas Jefferson Natl Accelerator Facil Joint Inst, 1070 Univ Blvd, Portsmouth, VA 23703 USA
[4] Univ Massachusetts, Miner Sch Comp & Informat Sci, 1 Univ Ave, Lowell, MA 01854 USA
Keywords
Super resolution; Knowledge distillation; GAN; Generative adversarial networks; Neural network;
DOI
10.1007/s00138-025-01713-9
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent progress in high-fidelity image synthesis with GANs has shown appealing results, motivating a series of successful image super-resolution (SR) works. However, most GAN-based SR models apply plain GAN models and are cumbersome compared with CNN-based SR models. In addition, the key advantage of conditional GANs has not yet been explored in the context of SR. In this paper, we develop a lightweight SR-BigGAN with priors for single-image super-resolution (SISR). First, our new model is an extension of BigGAN tailored to the deep SR pipeline, retaining both the generator and discriminator architectures but modifying them to accommodate SR tasks. Second, prior knowledge, defined as class labels from the low-resolution images, is fully leveraged through the conditional generative model to refine the SR process. Third, the lightweight nature of the model is achieved through knowledge distillation, which reduces computational complexity and memory usage, making it the first practice of this kind in GAN-based SISR modeling. Extensive experiments on DIV2K, Pascal, mini-ImageNet, and SR benchmarks including Set5 and Set14, comparing Structural Similarity (SSIM) and Peak Signal-to-Noise Ratio (PSNR) against state-of-the-art models, show appealing results. Our model achieves an average PSNR of 34.99 and SSIM of 0.791 across these datasets, demonstrating quantitative improvements over existing methods. The generated high-resolution images offer both perceptual enhancement and improved classification results. Additionally, explicit comparisons with GAN-based SR techniques such as ESRGAN and SRGAN highlight the superiority of our approach in both fidelity and efficiency.
In particular, we achieve an average PSNR of 34.99 and SSIM of 0.791 on several testing datasets.
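The knowledge-distillation idea described in the abstract can be sketched as follows. This is a minimal, illustrative formulation only: the function names, the 1-D "images", and the specific weighting `alpha` are our assumptions, not the paper's actual implementation, in which a lightweight student generator would be trained against both the ground truth and a frozen teacher generator's output.

```python
def mse(a, b):
    """Mean squared error between two equally sized pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distillation_loss(student_out, teacher_out, ground_truth, alpha=0.5):
    """Blend supervised SR loss with imitation of the larger teacher.

    alpha weights ground-truth supervision; (1 - alpha) weights the
    distillation term that pulls the student toward the teacher.
    """
    task = mse(student_out, ground_truth)    # supervised reconstruction
    distill = mse(student_out, teacher_out)  # match the teacher's output
    return alpha * task + (1 - alpha) * distill

# Toy usage with 1-D "images":
gt      = [1.0, 0.0, 1.0, 0.0]
teacher = [0.9, 0.1, 0.9, 0.1]
student = [0.8, 0.2, 0.8, 0.2]
loss = distillation_loss(student, teacher, gt, alpha=0.5)  # 0.025
```

Minimizing such a combined objective is what lets the student stay small while inheriting behavior from the heavier teacher network.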
Pages: 18
相关论文
共 65 条
[1]  
Vasu S., Thekke Madam N., Rajagopalan A., Analyzing perception–distortion tradeoff using enhanced perceptual super-resolution network, Proceedings of the European Conference on Computer Vision (ECCV) Workshops, (2018)
[2]  
Wu H., Zheng S., Zhang J., Huang K., Gp-gan: towards realistic high-resolution image blending, Proceedings of the 27th ACM International Conference on Multimedia, pp. 2487-2495, (2019)
[3]  
Taigman Y., Polyak A., Wolf L., Unsupervised Cross-Domain Image Generation, (2016)
[4]  
Jin Y., Zhang J., Li M., Tian Y., Zhu H., Fang Z., Towards the Automatic Anime Characters Creation with Generative Adversarial Networks, (2017)
[5]  
Yoo D., Kim N., Park S., Paek A.S., Kweon I.S., Pixel-level domain transfer, European Conference on Computer Vision, pp. 517-532, (2016)
[6]  
Zhang H., Xu T., Li H., Zhang S., Wang X., Huang X., Metaxas D.N., Stackgan: text to photo-realistic image synthesis with stacked generative adversarial networks, Proceedings of the IEEE International Conference on Computer Vision, pp. 5907-5915, (2017)
[7]  
Isola P., Zhu J.-Y., Zhou T., Efros A.A., Image-to-image translation with conditional adversarial networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125-1134, (2017)
[8]  
Wolterink J.M., Leiner T., Viergever M.A., Isgum I., Generative adversarial networks for noise reduction in low-dose CT, IEEE Trans. Med. Imaging, 36, 12, pp. 2536-2545, (2017)
[9]  
Qin C., Schlemper J., Caballero J., Price A.N., Hajnal J.V., Rueckert D., Convolutional recurrent neural networks for dynamic MR image reconstruction, IEEE Trans. Med. Imaging, 38, 1, pp. 280-290, (2018)
[10]  
Gong Y., Shan H., Teng Y., Tu N., Li M., Liang G., Wang G., Wang S., Parameter-transferred Wasserstein generative adversarial network (PT-WGAN) for low-dose pet image denoising, IEEE Trans. Radiat. Plasma Med. Sci, 5, 2, pp. 213-223, (2020)