Multi-Semantic Fusion Generative Adversarial Network for Text-to-Image Generation

Cited by: 2
Authors
Huang, Pingda [1]
Liu, Yedan [1]
Fu, Chunjiang [1]
Zhao, Liang [1]
Affiliations
[1] Dalian Univ Technol, Sch Software Technol, Dalian 116600, Peoples R China
Source
2023 IEEE 8th International Conference on Big Data Analytics (ICBDA) | 2023
Funding
National Natural Science Foundation of China
Keywords
text-to-image; GANs; feature fusion; BERT
DOI
10.1109/ICBDA57405.2023.10104850
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text-to-image synthesis faces two persistent challenges: image quality and image-text alignment. Existing methods mainly synthesize an image from a single sentence, which makes it difficult to extract adequate semantic features, so the generated images differ substantially from the ground-truth images. In this paper, we propose a novel Multi-Semantic Fusion Generative Adversarial Network. Our model fuses the semantics shared across different sentences while preserving the semantics unique to each, in order to generate accurate images. In addition, we design a multi-sentence joint discriminator to ensure that the generated images match all of the sentences. Experiments on the CUB and MSCOCO datasets demonstrate that our model has significant advantages.
Pages: 159-164
Page count: 6