DiffI2I: Efficient Diffusion Model for Image-to-Image Translation

Cited by: 0
Authors
Xia, Bin [1]
Zhang, Yulun [2]
Wang, Shiyin [3]
Wang, Yitong [3]
Wu, Xinglong [3]
Tian, Yapeng [4]
Yang, Wenming [1]
Timofte, Radu [5, 6]
Van Gool, Luc [2]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China
[2] Swiss Fed Inst Technol, Comp Vis Lab, CH-8092 Zurich, Switzerland
[3] Bytedance Inc, Shenzhen 518055, Peoples R China
[4] Univ Texas Dallas, Dept Comp Sci, Richardson, TX 75080 USA
[5] Univ Wurzburg, Comp Vis Lab, IFI, D-97070 Wurzburg, Germany
[6] Univ Wurzburg, CAIDAS, D-97070 Wurzburg, Germany
Funding
National Natural Science Foundation of China;
Keywords
Intellectual property; Noise reduction; Runtime; Image synthesis; Transformers; Semantic segmentation; Image restoration; Diffusion processes; Dense prediction; diffusion model; image restoration; image-to-image translation; inpainting; motion deblurring; super-resolution; SUPERRESOLUTION; RESTORATION; NETWORK;
DOI
10.1109/TPAMI.2024.3498003
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Diffusion Model (DM) has emerged as the SOTA approach for image synthesis. However, existing DMs do not perform well on some image-to-image translation (I2I) tasks. Unlike image synthesis, some I2I tasks, such as super-resolution, require generating results in accordance with GT images. Traditional DMs for image synthesis require extensive iterations and large denoising models to estimate entire images, which gives them strong generative ability but also leads to artifacts and inefficiency for I2I. To tackle this challenge, we propose a simple, efficient, and powerful DM framework for I2I, called DiffI2I. Specifically, DiffI2I comprises three key components: a compact I2I prior extraction network (CPEN), a dynamic I2I transformer (DI2Iformer), and a denoising network. We train DiffI2I in two stages: pretraining and DM training. For pretraining, GT and input images are fed into CPEN_S1 to capture a compact I2I prior representation (IPR) that guides DI2Iformer. In the second stage, the DM is trained to use only the input images to estimate the same IPR as CPEN_S1. Compared to traditional DMs, the compact IPR enables DiffI2I to obtain more accurate outcomes and employ a lighter denoising network and fewer iterations. Through extensive experiments on various I2I tasks, we demonstrate that DiffI2I achieves SOTA performance while significantly reducing computational burdens.
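The core idea in the abstract, running the diffusion process on a short prior vector rather than on a whole image, can be illustrated with a toy sketch. This is not the authors' implementation. The linear beta schedule, the 8-dimensional vector, the L1 objective, and every function name below are assumptions made for illustration, and the stage-1 CPEN_S1 output is replaced by random numbers. An oracle that returns the true noise stands in for the trained denoising network.

```python
import math
import random


def alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        bars.append(running)
    return bars


def diffuse(z0, a_bar, noise):
    """Closed-form forward process: z_t = sqrt(a_bar) * z0 + sqrt(1 - a_bar) * noise."""
    return [math.sqrt(a_bar) * z + math.sqrt(1.0 - a_bar) * n
            for z, n in zip(z0, noise)]


def estimate_z0(zt, a_bar, predicted_noise):
    """Invert the forward process given a noise estimate from a denoiser."""
    return [(z - math.sqrt(1.0 - a_bar) * n) / math.sqrt(a_bar)
            for z, n in zip(zt, predicted_noise)]


def l1_distance(a, b):
    """Mean absolute difference, used here as an assumed stage-2 objective."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
# Hypothetical stand-in for the compact prior (IPR) produced in stage 1.
ipr_stage1 = [rng.uniform(-1.0, 1.0) for _ in range(8)]
noise = [rng.gauss(0.0, 1.0) for _ in ipr_stage1]

# A handful of steps suffices for this toy vector; the count is an assumption.
bars = alpha_bars(num_steps=4)
noisy_ipr = diffuse(ipr_stage1, bars[-1], noise)

# Oracle noise predictor: a trained network would estimate `noise`
# from `noisy_ipr` and features of the input image alone.
recovered_ipr = estimate_z0(noisy_ipr, bars[-1], noise)
stage2_loss = l1_distance(recovered_ipr, ipr_stage1)
```

Because the diffused object is a short vector, each denoising call is cheap, which is consistent with the abstract's claim that a lighter denoising network and fewer iterations are enough.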
Pages: 1578-1593
Page count: 16
Related papers
50 in total
  • [31] Continuous and Diverse Image-to-Image Translation via Signed Attribute Vectors
    Mao, Qi
    Tseng, Hung-Yu
    Lee, Hsin-Ying
    Huang, Jia-Bin
    Ma, Siwei
    Yang, Ming-Hsuan
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (02) : 517 - 549
  • [32] Background-focused contrastive learning for unpaired image-to-image translation
    Shao, Mingwen
    Han, Minggui
    Meng, Lingzhuang
    Liu, Fukang
    JOURNAL OF ELECTRONIC IMAGING, 2024, 33 (04)
  • [33] Continuous and Diverse Image-to-Image Translation via Signed Attribute Vectors
    Qi Mao
    Hung-Yu Tseng
    Hsin-Ying Lee
    Jia-Bin Huang
    Siwei Ma
    Ming-Hsuan Yang
    International Journal of Computer Vision, 2022, 130 : 517 - 549
  • [34] Unsupervised Image-to-Image Translation via Pre-Trained StyleGAN2 Network
    Huang, Jialu
    Liao, Jing
    Kwong, Sam
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1435 - 1448
  • [35] Unpaired Image-to-Image Translation Using Negative Learning for Noisy Patches
    Hung, Yu-Hsiang
    Tan, Julianne
    Huang, Tai-Ming
    Hsu, Shang-Che
    Chen, Yi-Ling
    Hua, Kai-Lung
    IEEE MULTIMEDIA, 2022, 29 (04) : 59 - 68
  • [36] Hierarchical Detailed Intermediate Supervision for Image-to-Image Translation
    Wang, Jianbo
    Huang, Haozhi
    Shen, Li
    Wang, Xuan
    Yamasaki, Toshihiko
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (12) : 2085 - 2096
  • [37] Hubble Meets Webb: Image-to-Image Translation in Astronomy
    Kinakh, Vitaliy
    Belousov, Yury
    Quetant, Guillaume
    Drozdova, Mariia
    Holotyak, Taras
    Schaerer, Daniel
    Voloshynovskiy, Slava
    SENSORS, 2024, 24 (04)
  • [38] GAIT: GRADIENT ADJUSTED UNSUPERVISED IMAGE-TO-IMAGE TRANSLATION
    Akkaya, Ibrahim Batuhan
    Halici, Ugur
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 1591 - 1595
  • [39] Homomorphic Interpolation Network for Unpaired Image-to-Image Translation
    Chen, Ying-Cong
    Jia, Jiaya
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (05) : 2534 - 2547
  • [40] Diffusion Models for Cross-Domain Image-to-Image Translation with Paired and Partially Paired Datasets
    Bell, Trisk
    Li, Dan
    2024 IEEE 11TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS, DSAA 2024, 2024, : 38 - 45