A Diffusion Model Translator for Efficient Image-to-Image Translation

Cited by: 3
Authors
Xia, Mengfei [1 ]
Zhou, Yu [1 ]
Yi, Ran [2 ]
Liu, Yong-Jin [1 ]
Wang, Wenping [3 ]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, MOE Key Lab Pervas Comp, Beijing 100084, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[3] Texas A&M Univ, Dept Comp Sci & Comp Engn, College Stn, TX 77840 USA
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Task analysis; Noise reduction; Diffusion models; Diffusion processes; Training; Computer science; Trajectory; image translation; deep learning; generative models;
DOI
10.1109/TPAMI.2024.3435448
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Applying diffusion models to image-to-image translation (I2I) has recently received increasing attention due to its practical applications. Previous attempts inject information from the source image into every denoising step for iterative refinement, resulting in a time-consuming implementation. We propose an efficient method that equips a diffusion model with a lightweight translator, dubbed a Diffusion Model Translator (DMT), to accomplish I2I. Specifically, we first offer theoretical justification that, when employing the pioneering DDPM for the I2I task, it is both feasible and sufficient to transfer the distribution from one domain to another only at some intermediate timestep. We further observe that translation performance depends strongly on the timestep chosen for domain transfer, and therefore propose a practical strategy to automatically select an appropriate timestep for a given task. We evaluate our approach on a range of I2I applications, including image stylization, image colorization, segmentation to image, and sketch to image, to validate its efficacy and general utility. The comparisons show that our DMT surpasses existing methods in both quality and efficiency. Code is available at https://github.com/THU-LYJ-Lab/dmt.
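The abstract's core idea, transferring between domains only at one intermediate diffusion timestep, can be sketched as follows. This is a toy illustration with hypothetical names (`forward_diffuse`, `translator`, `t0`), not the authors' released code: the source image is noised by the standard DDPM forward process up to an intermediate step t0, a lightweight translator maps the noisy latent toward the target domain, and a target-domain DDPM would then perform only the remaining t0 reverse steps.

```python
import math
import random

T = 1000  # total diffusion steps
# Linear beta schedule, as in the original DDPM formulation.
BETAS = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]

def alpha_bar(t):
    """Cumulative product of (1 - beta_s) for s = 0..t."""
    prod = 1.0
    for s in range(t + 1):
        prod *= 1.0 - BETAS[s]
    return prod

def forward_diffuse(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    abar = alpha_bar(t)
    return [math.sqrt(abar) * v + math.sqrt(1.0 - abar) * rng.gauss(0, 1)
            for v in x0]

def translator(x_t):
    """Placeholder for the learned lightweight translator; a toy affine
    map stands in for the domain transfer applied at timestep t0."""
    return [0.9 * v + 0.1 for v in x_t]

rng = random.Random(0)
x0_source = [0.5] * 8            # toy "image" from the source domain
t0 = 400                         # chosen intermediate transfer timestep
x_t0 = forward_diffuse(x0_source, t0, rng)
x_t0_translated = translator(x_t0)
# A DDPM pretrained on the target domain would now denoise
# x_t0_translated from step t0 down to 0, so only t0 reverse steps run,
# instead of conditioning on the source image at all T steps.
```

The efficiency claim follows directly from this structure: the source image is consulted once, at the transfer point, rather than being injected into every one of the T denoising iterations.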
Pages: 10272-10283 (12 pages)