MGF-GAN: Multi Granularity Text Feature Fusion for Text-guided-Image Synthesis

Cited by: 1
Authors
Wang, Xingfu [1]
Li, Xiangyu [1]
Hawbani, Ammar [1]
Zhao, Liang [2]
Alsamhi, Saeed Hamood [3, 4]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Peoples R China
[2] Shenyang Aerosp Univ, Sch Comp Sci, Shenyang, Peoples R China
[3] Natl Univ Ireland, Insight Ctr Data Analyt, Galway, Ireland
[4] IBB Univ, Ibb, Yemen
Source
2022 IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM | 2022
Keywords
Text-guided-Image; GAN; Aspect-level; Semantic consistency;
DOI
10.1109/TrustCom56396.2022.00197
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
We study text-to-image synthesis. Our analysis of widely used methods shows that they often build generative adversarial network (GAN) models from stacked structures and usually introduce multiple generator-discriminator pairs. The entanglement between different generators degrades the quality of the final synthesized image. Some researchers have proposed single-stage network models to avoid this entanglement between multiple generators, but these models do not exploit the different granularities of information in unstructured natural language. To address this limitation, we propose MGF-GAN, a multi-granularity feature network that builds on the advantages of the single-stage network and makes use of text information at different granularities. Specifically, we feed three granularities of text features (sentences, aspect words, and single words) into different stages of the model through spatial attention and channel attention mechanisms, gradually refining the synthesized image from global and local perspectives. In addition, we reformulate the loss function based on the idea of contrast to stabilize training and to keep the visual semantics of the synthesized image consistent with the natural-language description. Experiments on the CUB bird and COCO datasets show that MGF-GAN is effective and improves on prior approaches.
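The abstract does not state the form of the contrast-based loss. As a minimal sketch only, assuming a symmetric InfoNCE-style objective over matched image and sentence embeddings (a common contrastive formulation, not confirmed as the authors' loss), the idea of keeping a synthesized image semantically close to its own description can be illustrated as follows. The function names, the temperature value, and the random embeddings are hypothetical and chosen for illustration.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def contrastive_loss(image_emb, text_emb, temperature=0.1):
    """Symmetric InfoNCE-style loss (an assumed form, not the paper's).

    Row i of image_emb is assumed to match row i of text_emb; every other
    row in the batch acts as a negative example.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature        # (batch, batch) similarities
    idx = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # image -> text
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)

# Hypothetical embeddings: 4 sentence vectors and near-identical image vectors.
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 16))
matched_images = text + 0.01 * rng.normal(size=(4, 16))
mismatched_images = np.roll(matched_images, 1, axis=0)  # every pair misaligned

loss_matched = contrastive_loss(matched_images, text)
loss_mismatched = contrastive_loss(mismatched_images, text)
print(loss_matched, loss_mismatched)
```

Under these assumptions, the loss is lower when each image embedding is paired with its own sentence than when the pairs are misaligned, which is the property a contrast-based consistency term relies on.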
Pages: 1398-1403
Page count: 6