MGF-GAN: Multi Granularity Text Feature Fusion for Text-guided-Image Synthesis

Cited by: 1
Authors
Wang, Xingfu [1]
Li, Xiangyu [1]
Hawbani, Ammar [1]
Zhao, Liang [2]
Alsamhi, Saeed Hamood [3, 4]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Peoples R China
[2] Shenyang Aerosp Univ, Sch Comp Sci, Shenyang, Peoples R China
[3] Natl Univ Ireland, Insight Ctr Data Analyt, Galway, Ireland
[4] IBB Univ, Ibb, Yemen
Source
2022 IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM | 2022
Keywords
Text-guided-Image; GAN; Aspect-level; Semantic consistency;
DOI
10.1109/TrustCom56396.2022.00197
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We present results on the challenging problem of text-to-image synthesis. Our survey of widely cited work shows that existing methods often build generative adversarial network (GAN) models from stacked structures and usually introduce multiple generator-discriminator pairs. The entanglement between different generators degrades the quality of the final synthesized image. Some researchers have proposed single-stage network models to avoid the pitfalls of multiple generators, but these models do not exploit unstructured natural language information at different granularities. To address this shortcoming, we propose MGF-GAN, a multi-granularity feature network that builds on the advantages of the single-stage network and brings text information of different granularities into play. Specifically, we feed three granularities of text features (sentences, aspect words, and single words) into different stages of the model through spatial attention and channel attention mechanisms, gradually refining the synthesized image from global and local perspectives. In addition, we reformulate the loss function based on the contrastive idea to stabilize training and to keep the synthesized image semantically consistent with the natural language description. We conducted experiments on the CUB bird and COCO datasets. The marked improvements support the effectiveness and advancement of MGF-GAN.
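The abstract mentions a loss "based on the contrast concept" but does not give its formula. As a rough illustration only, one widely used contrastive formulation for image-text matching is the symmetric InfoNCE objective, sketched below in plain Python. This is a generic loss, not MGF-GAN's actual loss; the function names, the temperature value, and the toy feature vectors are all assumptions made for the example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cosine_similarity_matrix(image_feats, text_feats):
    """sim[i][j] = cosine similarity between image i and text j."""
    imgs = [l2_normalize(v) for v in image_feats]
    txts = [l2_normalize(v) for v in text_feats]
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]


def _cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row k is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[k]
    return total / len(logits)


def contrastive_loss(image_feats, text_feats, temperature=0.1):
    """Symmetric InfoNCE: pair k (image k, text k) is the positive pair.

    The temperature of 0.1 is an assumed value, not taken from the paper.
    """
    sim = cosine_similarity_matrix(image_feats, text_feats)
    logits = [[s / temperature for s in row] for row in sim]
    transposed = [list(col) for col in zip(*logits)]
    image_to_text = _cross_entropy_diagonal(logits)
    text_to_image = _cross_entropy_diagonal(transposed)
    return 0.5 * (image_to_text + text_to_image)


# Toy features (hypothetical): matched pairs should score a lower loss.
matched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this sketch the loss is small when each image embedding is closest to its own caption embedding and large when captions are swapped, which is the sense in which a contrastive term can encourage image-text semantic consistency.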
Pages: 1398-1403
Number of pages: 6
Related papers
50 in total
  • [31] MISL: Multi-grained image-text semantic learning for text-guided image inpainting
    Wu, Xingcai
    Zhao, Kejun
    Huang, Qianding
    Wang, Qi
    Yang, Zhenguo
    Hao, Gefei
    PATTERN RECOGNITION, 2024, 145
  • [32] MIGT: Multi-modal image inpainting guided with text
    Li, Ailin
    Zhao, Lei
    Zuo, Zhiwen
    Wang, Zhizhong
    Xing, Wei
    Lu, Dongming
    NEUROCOMPUTING, 2023, 520 : 376 - 385
  • [33] A Comparison between AttnGAN and DF GAN: Text to Image Synthesis
    Sumi, Philo
    Sindhuja, S.
    Sureshkumar, S.
    ICSPC'21: 2021 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION (ICPSC), 2021, : 615 - 619
  • [34] Study on Feature Layer fusion Classification Model on Text/Image Information
    Zhang, Xiao-Dan
    2012 INTERNATIONAL CONFERENCE ON MEDICAL PHYSICS AND BIOMEDICAL ENGINEERING (ICMPBE2012), 2012, 33 : 1050 - 1053
  • [35] Study on Feature Layer fusion Classification Model on Text/Image Information
    Zhang, Xiao-Dan
    2010 INTERNATIONAL COLLOQUIUM ON COMPUTING, COMMUNICATION, CONTROL, AND MANAGEMENT (CCCM2010), VOL IV, 2010, : 196 - 198
  • [36] MMFL: Multimodal Fusion Learning for Text-Guided Image Inpainting
    Lin, Qing
    Yan, Bo
    Li, Jichun
    Tan, Weimin
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 1094 - 1102
  • [37] DMF-GAN: Deep Multimodal Fusion Generative Adversarial Networks for Text-to-Image Synthesis
    Yang, Bing
    Xiang, Xueqin
    Kong, Wangzeng
    Zhang, Jianhai
    Peng, Yong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6956 - 6967
  • [38] Image and Encoded Text Fusion for Multi-Modal Classification
    Gallo, I.
    Calefati, A.
    Nawaz, S.
    Janjua, M. K.
    2018 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA), 2018, : 203 - 209
  • [39] Edge consistent image completion based on multi-granularity feature fusion
    Zhang S.-Y.
    Wang G.-Y.
    Liu Q.
    Wang R.-Q.
    Kongzhi yu Juece/Control and Decision, 2022, 37 (12): : 3240 - 3250
  • [40] EMF-Net: An edge-guided multi-feature fusion network for text manipulation detection
    Ren, Ruyong
    Hao, Qixian
    Gu, Feng
    Niu, Shaozhang
    Zhang, Jiwei
    Wang, Maosen
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249