A Diffusion Model Translator for Efficient Image-to-Image Translation

Cited by: 3

Authors:

Xia, Mengfei [1]

Zhou, Yu [1]

Yi, Ran [2]

Liu, Yong-Jin [1]

Wang, Wenping [3]

Affiliations:

[1] Tsinghua Univ, Dept Comp Sci & Technol, MOE Key Lab Pervas Comp, Beijing 100084, Peoples R China

[2] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China

[3] Texas A&M Univ, Dept Comp Sci & Comp Engn, College Stn, TX 77840 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2024 / Vol. 46 / No. 12

Funding:

National Natural Science Foundation of China; Beijing Natural Science Foundation;

Keywords:

Task analysis; Noise reduction; Diffusion models; Diffusion processes; Training; Computer science; Trajectory; image translation; deep learning; generative models;

DOI:

10.1109/TPAMI.2024.3435448

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Applying diffusion models to image-to-image translation (I2I) has recently received increasing attention due to its practical applications. Previous attempts inject information from the source image into each denoising step for an iterative refinement, thus resulting in a time-consuming implementation. We propose an efficient method that equips a diffusion model with a lightweight translator, dubbed a Diffusion Model Translator (DMT), to accomplish I2I. Specifically, we first offer theoretical justification that in employing the pioneering DDPM work for the I2I task, it is both feasible and sufficient to transfer the distribution from one domain to another only at some intermediate step. We further observe that the translation performance highly depends on the chosen timestep for domain transfer, and therefore propose a practical strategy to automatically select an appropriate timestep for a given task. We evaluate our approach on a range of I2I applications, including image stylization, image colorization, segmentation to image, and sketch to image, to validate its efficacy and general utility. The comparisons show that our DMT surpasses existing methods in both quality and efficiency. Code is available at https://github.com/THU-LYJ-Lab/dmt.
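The idea in the abstract, transferring between domains only once at an intermediate timestep rather than at every denoising step, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the linear beta schedule and its values are assumptions borrowed from common DDPM practice, and `translator` and `denoise_from` are hypothetical placeholders standing in for the learned translator and a pretrained target-domain denoiser.

```python
import numpy as np

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    # Assumed noise schedule (a common DDPM choice); the record does not specify one.
    return np.linspace(beta_start, beta_end, num_steps)

def forward_diffuse(x0, t, alpha_bar, rng):
    # Closed-form DDPM forward process:
    # x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I)
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def translate_at_timestep(x0_source, t, alpha_bar, translator, denoise_from, rng):
    # 1) Noise the source image up to the chosen intermediate timestep t.
    x_t_source = forward_diffuse(x0_source, t, alpha_bar, rng)
    # 2) Apply the (hypothetical) translator once, at timestep t only.
    x_t_target = translator(x_t_source, t)
    # 3) Run the (hypothetical) target-domain reverse process from t down to 0.
    return denoise_from(x_t_target, t)

betas = linear_beta_schedule()
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_s)

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(3, 8, 8))  # toy "image" scaled to [-1, 1]

# Identity placeholders so the sketch runs end to end without trained models.
identity_translator = lambda x, t: x
no_op_denoiser = lambda x, t: x

out = translate_at_timestep(x0, 400, alpha_bar, identity_translator, no_op_denoiser, rng)
```

In this sketch the timestep (400 here) is fixed by hand; the abstract states that the paper instead proposes a strategy to select it automatically for a given task.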


Pages: 10272-10283

Number of pages: 12

50 items in total

[21] Facial Feature Based Image-to-Image Translation Method
Kang, Shinjin
KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2020, 14 (12) : 4835 - 4848
[22] Homomorphic Interpolation Network for Unpaired Image-to-Image Translation
Chen, Ying-Cong
Jia, Jiaya
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (05) : 2534 - 2547
[23] Photogenic Guided Image-to-Image Translation With Single Encoder
Oh, Rina
Gonsalves, T.
IEEE OPEN JOURNAL OF THE COMPUTER SOCIETY, 2024, 5 : 624 - 635
[24] Improving Shape Deformation in Unsupervised Image-to-Image Translation
Gokaslan, Aaron
Ramanujan, Vivek
Ritchie, Daniel
Kim, Kwang In
Tompkin, James
COMPUTER VISION - ECCV 2018, PT XII, 2018, 11216 : 662 - 678
[25] Injecting-Diffusion: Inject Domain-Independent Contents into Diffusion Models for Unpaired Image-to-Image Translation
Li, Luying
Ma, Lizhuang
2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023 : 282 - 287
[26] UMGAN: Underwater Image Enhancement Network for Unpaired Image-to-Image Translation
Sun, Boyang
Mei, Yupeng
Yan, Ni
Chen, Yingyi
JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2023, 11 (02)
[27] Unified Generative Adversarial Networks for Controllable Image-to-Image Translation
Tang, Hao
Liu, Hong
Sebe, Nicu
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 8916 - 8929
[28] Guided Image-to-Image Translation by Discriminator-Generator Communication
Cao, Yuanjiang
Yao, Lina
Pan, Le
Sheng, Quan Z.
Chang, Xiaojun
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 1528 - 1538
[29] Knowledge Distillation Generative Adversarial Network for Image-to-Image Translation
Sub-r-pa, Chayanon
Chen, Rung-Ching
JOURNAL OF ADVANCES IN INFORMATION TECHNOLOGY, 2024, 15 (08) : 896 - 902
[30] Image-to-Image Translation With Disentangled Latent Vectors for Face Editing
Dalva, Yusuf
Pehlivan, Hamza
Hatipoglu, Oyku Irmak
Moran, Cansu
Dundar, Aysegul
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 14777 - 14788
