GAN-Diffusion Relay Model: Advancing Semantic Image Synthesis

Times cited: 0
Authors
Jia, Jinyin [1, 2]
Yang, Jun [1, 2]
Fan, Anfei [1, 2]
Chen, Junfan [1, 2]
Cao, Peng [1, 2]
Zhang, Chiyu [1, 2]
Li, Wei [1, 2]
Affiliations
[1] Sichuan Normal Univ, Coll Comp Sci, Chengdu, Peoples R China
[2] Visual Comp & Virtual Real Key Lab Sichuan Prov, Chengdu, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT IV | 2025 / Vol. 15034
Funding
National Natural Science Foundation of China
Keywords
Semantic image synthesis; GAN; Diffusion model
DOI
10.1007/978-981-97-8505-6_28
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semantic image synthesis, which transforms semantic layouts into realistic images, aims to comprehend and leverage the given semantic information. Despite impressive recent advances, challenges persist in fidelity, semantic alignment, and training stability. To improve generation quality and semantic alignment, we re-engineer the noise mapping and the semantic space embedding and propose a novel semantic image synthesis model, the GAN-Diffusion Relay Model (GDRM), based on a GAN and a relay diffusion model. Extensive experiments on benchmark datasets validate the effectiveness of the proposed approach, which achieves state-of-the-art performance in terms of fidelity (FID) and diversity (LPIPS).
Pages: 392-405
Page count: 14