Multi-Semantic Fusion Generative Adversarial Network for Text-to-Image Generation

Cited by: 2
Authors
Huang, Pingda [1]
Liu, Yedan [1]
Fu, Chunjiang [1]
Zhao, Liang [1]
Affiliation
[1] Dalian University of Technology, School of Software Technology, Dalian 116600, People's Republic of China
Source
2023 IEEE 8TH INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS, ICBDA | 2023
Funding
National Natural Science Foundation of China;
Keywords
text-to-image; GANs; feature fusion; BERT;
DOI
10.1109/ICBDA57405.2023.10104850
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text-to-image synthesis faces two persistent challenges: image quality and image-text alignment. Existing methods mainly synthesize an image from a single sentence, which makes it difficult to extract adequate semantic features and leaves the generated images far from the ground-truth images. In this paper, we propose a novel Multi-Semantic Fusion Generative Adversarial Network. Our model fuses the semantics shared across different sentences while preserving the semantics unique to each, so that it generates more accurate images. In addition, we design a multi-sentence joint discriminator to ensure that the generated images match all of the sentences. Experiments on the CUB and MSCOCO datasets demonstrate that our model has significant advantages.
Pages: 159-164
Number of pages: 6