DiffI2I: Efficient Diffusion Model for Image-to-Image Translation

Cited by: 0

Authors:

Xia, Bin ^{[1]}

Zhang, Yulun ^{[2]}

Wang, Shiyin ^{[3]}

Wang, Yitong ^{[3]}

Wu, Xinglong ^{[3]}

Tian, Yapeng ^{[4]}

Yang, Wenming ^{[1]}

Timofte, Radu ^{[5,6]}

Van Gool, Luc ^{[2]}

Affiliations:

[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China

[2] Swiss Fed Inst Technol, Comp Vis Lab, CH-8092 Zurich, Switzerland

[3] Bytedance Inc, Shenzhen 518055, Peoples R China

[4] Univ Texas Dallas, Dept Comp Sci, Richardson, TX 75080 USA

[5] Univ Wurzburg, Comp Vis Lab, IFI, D-97070 Wurzburg, Germany

[6] Univ Wurzburg, CAIDAS, D-97070 Wurzburg, Germany

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2025, Vol. 47, No. 03

Funding:

National Natural Science Foundation of China;

Keywords:

Intellectual property; Noise reduction; Runtime; Image synthesis; Transformers; Semantic segmentation; Image restoration; Diffusion processes; Dense prediction; diffusion model; image restoration; image-to-image translation; inpainting; motion deblurring; super-resolution; SUPERRESOLUTION; RESTORATION; NETWORK;

DOI:

10.1109/TPAMI.2024.3498003

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The Diffusion Model (DM) has emerged as the SOTA approach for image synthesis. However, existing DMs do not perform well on some image-to-image translation (I2I) tasks. Unlike image synthesis, some I2I tasks, such as super-resolution, require generating results that agree with the ground-truth (GT) images. Traditional DMs for image synthesis require extensive iterations and large denoising models to estimate entire images, which gives them strong generative ability but also leads to artifacts and inefficiency for I2I. To tackle this challenge, we propose a simple, efficient, and powerful DM framework for I2I, called DiffI2I. Specifically, DiffI2I comprises three key components: a compact I2I prior extraction network (CPEN), a dynamic I2I transformer (DI2Iformer), and a denoising network. We train DiffI2I in two stages: pretraining and DM training. For pretraining, GT and input images are fed into CPEN_{S1} to capture a compact I2I prior representation (IPR) that guides DI2Iformer. In the second stage, the DM is trained to use only the input images to estimate the same IPR as CPEN_{S1}. Compared to traditional DMs, the compact IPR enables DiffI2I to obtain more accurate outcomes and to employ a lighter denoising network and fewer iterations. Through extensive experiments on various I2I tasks, we demonstrate that DiffI2I achieves SOTA performance while significantly reducing computational burdens.
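The two-stage data flow described in the abstract can be sketched with a toy example. This is only an illustration under stated assumptions, not the authors' implementation: images are flat lists of floats, the IPR is a 4-element vector, and `cpen_s1`, `di2iformer`, `estimate_ipr`, and `oracle_denoiser` are hypothetical stand-ins for the learned networks. In particular, the toy denoiser is handed the stage-1 target directly, whereas a trained denoising network would have to predict it from the input image.

```python
import random

IPR_DIM = 4  # assumed size of the compact prior; the abstract gives no number


def cpen_s1(gt, inp):
    """Stage 1 (toy): compress the GT/input residual into IPR_DIM chunk means."""
    chunk = len(inp) // IPR_DIM
    residual = [g - x for g, x in zip(gt, inp)]
    return [sum(residual[i * chunk:(i + 1) * chunk]) / chunk
            for i in range(IPR_DIM)]


def di2iformer(inp, ipr):
    """Toy translator: shift each input chunk by its IPR entry."""
    chunk = len(inp) // len(ipr)
    return [x + ipr[min(i // chunk, len(ipr) - 1)] for i, x in enumerate(inp)]


def estimate_ipr(inp, denoiser, steps, rng):
    """Stage 2 (toy): start from Gaussian noise in the small IPR space and
    refine it for a few steps, conditioned on the input image only."""
    z = [rng.gauss(0.0, 1.0) for _ in range(IPR_DIM)]
    for t in reversed(range(steps)):
        z = denoiser(z, inp, t)
    return z


def l1(a, b):
    """Mean absolute error between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Stage 1: GT and input together yield the IPR that guides the translator.
inp = [0.0] * 8
gt = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
ipr_s1 = cpen_s1(gt, inp)
stage1_out = di2iformer(inp, ipr_s1)


def oracle_denoiser(z, cond, t):
    """Assumption: moves halfway toward the stage-1 IPR at every step.
    A real denoiser would be trained to do this from `cond` alone."""
    return [zi + 0.5 * (ti - zi) for zi, ti in zip(z, ipr_s1)]


# Stage 2: the training signal is the gap between estimated and stage-1 IPR.
loss_1_step = l1(estimate_ipr(inp, oracle_denoiser, 1, random.Random(0)), ipr_s1)
loss_8_steps = l1(estimate_ipr(inp, oracle_denoiser, 8, random.Random(0)), ipr_s1)
stage2_out = di2iformer(inp, estimate_ipr(inp, oracle_denoiser, 8, random.Random(0)))
```

The point of the sketch is that diffusion runs over a short vector rather than a full image, which is the property the abstract credits for allowing a lighter denoising network and fewer iterations.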


Pages: 1578 - 1593

Page count: 16

50 entries in total

[1] A Diffusion Model Translator for Efficient Image-to-Image Translation
Xia, Mengfei
Zhou, Yu
Yi, Ran
Liu, Yong-Jin
Wang, Wenping
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (12) : 10272 - 10283
[2] Dissecting and Mitigating Semantic Discrepancy in Stable Diffusion for Image-to-Image Translation
Yuan, Yifan
Yang, Guanqun
Wang, James Z.
Zhang, Hui
Shan, Hongming
Wang, Fei-Yue
Zhang, Junping
IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2025, 12 (04) : 705 - 718
[3] Unpaired Image-to-Image Translation with Diffusion Adversarial Network
Tu, Hangyao
Wang, Zheng
Zhao, Yanwei
MATHEMATICS, 2024, 12 (20)
[4] Vector Quantized Image-to-Image Translation
Chen, Yu-Jie
Cheng, Shin-I
Chiu, Wei-Chen
Tseng, Hung-Yu
Lee, Hsin-Ying
COMPUTER VISION - ECCV 2022, PT XVI, 2022, 13676 : 440 - 456
[5] Efficient Diffusion Model for Image Restoration by Residual Shifting
Yue, Zongsheng
Wang, Jianyi
Loy, Chen Change
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (01) : 116 - 130
[6] Hypercomplex Image-to-Image Translation
Grassucci, Eleonora
Sigillo, Luigi
Uncini, Aurelio
Comminiello, Danilo
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
[7] Towards annotation-efficient segmentation via image-to-image translation
Vorontsov, Eugene
Molchanov, Pavlo
Gazda, Matej
Beckham, Christopher
Kautz, Jan
Kadoury, Samuel
MEDICAL IMAGE ANALYSIS, 2022, 82
[8] Multimodal Unsupervised Image-to-Image Translation
Huang, Xun
Liu, Ming-Yu
Belongie, Serge
Kautz, Jan
COMPUTER VISION - ECCV 2018, PT III, 2018, 11207 : 179 - 196
[9] Image-to-Image Translation: Methods and Applications
Pang, Yingxue
Lin, Jianxin
Qin, Tao
Chen, Zhibo
IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 3859 - 3881
[10] SEMI2I: SEMANTICALLY CONSISTENT IMAGE-TO-IMAGE TRANSLATION FOR DOMAIN ADAPTATION OF REMOTE SENSING DATA
Tasar, Onur
Happy, S. L.
Tarabalka, Yuliya
Alliez, Pierre
IGARSS 2020 - 2020 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2020 : 1837 - 1840
