A Diffusion Model Translator for Efficient Image-to-Image Translation

Cited by: 3
Authors
Xia, Mengfei [1 ]
Zhou, Yu [1 ]
Yi, Ran [2 ]
Liu, Yong-Jin [1 ]
Wang, Wenping [3 ]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, MOE Key Lab Pervas Comp, Beijing 100084, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[3] Texas A&M Univ, Dept Comp Sci & Comp Engn, College Stn, TX 77840 USA
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Task analysis; Noise reduction; Diffusion models; Diffusion processes; Training; Computer science; Trajectory; image translation; deep learning; generative models;
DOI
10.1109/TPAMI.2024.3435448
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Applying diffusion models to image-to-image translation (I2I) has recently received increasing attention due to its practical applications. Previous attempts inject information from the source image into every denoising step for iterative refinement, resulting in a time-consuming implementation. We propose an efficient method that equips a diffusion model with a lightweight translator, dubbed a Diffusion Model Translator (DMT), to accomplish I2I. Specifically, we first offer theoretical justification that, when employing the pioneering DDPM for the I2I task, it is both feasible and sufficient to transfer the distribution from one domain to another only at some intermediate timestep. We further observe that translation performance depends strongly on the timestep chosen for domain transfer, and therefore propose a practical strategy to automatically select an appropriate timestep for a given task. We evaluate our approach on a range of I2I applications, including image stylization, image colorization, segmentation to image, and sketch to image, to validate its efficacy and general utility. The comparisons show that our DMT surpasses existing methods in both quality and efficiency. Code is available at https://github.com/THU-LYJ-Lab/dmt.
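The abstract's core idea, transferring between domains only at one intermediate diffusion timestep, can be sketched as follows. This is a toy illustration with hypothetical names (`forward_diffuse`, `translator`, `t0`), not the authors' released code: the source image is noised by the standard DDPM forward process up to an intermediate step t0, a lightweight translator maps the noisy latent toward the target domain, and a target-domain DDPM would then perform only the remaining t0 reverse steps.

```python
import math
import random

T = 1000  # total diffusion steps
# Linear beta schedule, as in the original DDPM formulation.
BETAS = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]

def alpha_bar(t):
    """Cumulative product of (1 - beta_s) for s = 0..t."""
    prod = 1.0
    for s in range(t + 1):
        prod *= 1.0 - BETAS[s]
    return prod

def forward_diffuse(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    abar = alpha_bar(t)
    return [math.sqrt(abar) * v + math.sqrt(1.0 - abar) * rng.gauss(0, 1)
            for v in x0]

def translator(x_t):
    """Placeholder for the learned lightweight translator; a toy affine
    map stands in for the domain transfer applied at timestep t0."""
    return [0.9 * v + 0.1 for v in x_t]

rng = random.Random(0)
x0_source = [0.5] * 8            # toy "image" from the source domain
t0 = 400                         # chosen intermediate transfer timestep
x_t0 = forward_diffuse(x0_source, t0, rng)
x_t0_translated = translator(x_t0)
# A DDPM pretrained on the target domain would now denoise
# x_t0_translated from step t0 down to 0, so only t0 reverse steps run,
# instead of conditioning on the source image at all T steps.
```

The efficiency claim follows directly from this structure: the source image is consulted once, at the transfer point, rather than being injected into every one of the T denoising iterations.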
Pages: 10272-10283 (12 pages)