MGF-GAN: Multi Granularity Text Feature Fusion for Text-guided-Image Synthesis

Cited by: 1

Authors:

Wang, Xingfu [1]

Li, Xiangyu [1]

Hawbani, Ammar [1]

Zhao, Liang [2]

Alsamhi, Saeed Hamood [3,4]

Affiliations:

[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Peoples R China

[2] Shenyang Aerosp Univ, Sch Comp Sci, Shenyang, Peoples R China

[3] Natl Univ Ireland, Insight Ctr Data Analyt, Galway, Ireland

[4] IBB Univ, Ibb, Yemen

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM | 2022

Keywords:

Text-guided-Image; GAN; Aspect-level; Semantic consistency;

DOI:

10.1109/TrustCom56396.2022.00197

Chinese Library Classification (CLC):

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

We present results on the challenging problem of text-to-image synthesis. Our review of widely cited work shows that existing methods often build generative adversarial network (GAN) models from stacked structures and usually introduce multiple generator-discriminator pairs. The entanglement between the different generators degrades the quality of the final synthesized image. Some researchers have proposed single-stage network models to avoid this entanglement between multiple generators, but these models do not exploit the information that unstructured natural language carries at different granularities. To address this shortcoming, we propose MGF-GAN, a multi-granularity feature network that keeps the advantages of the single-stage design while making use of text information at different granularities. Specifically, we feed three granularities of text features (sentences, aspect words, and single words) into different stages of the model through spatial attention and channel attention mechanisms, gradually refining the synthesized image from both global and local perspectives. In addition, we reformulate the loss function on a contrastive basis to stabilize training and to keep the synthesized image semantically consistent with the natural-language description. We ran validation experiments on the CUB bird and COCO datasets, and the results demonstrate the effectiveness of MGF-GAN.
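As an illustration of the contrastive idea mentioned in the abstract, the sketch below computes a symmetric InfoNCE-style loss over paired image and text embeddings. It is not the loss defined in the paper, whose exact form the abstract does not give; the function names, the temperature value, and the toy embeddings are all assumptions made for this example.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine_similarity_matrix(image_embs, text_embs):
    """Entry [i][j] is the cosine similarity of image i and text j."""
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(img, txt)) for txt in texts] for img in images]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_loss(image_embs, text_embs, temperature=0.1):
    """Symmetric InfoNCE-style loss (hypothetical stand-in, not the paper's loss).

    Matched image/text pairs share an index; every other pairing in the
    batch is treated as a negative.
    """
    sims = cosine_similarity_matrix(image_embs, text_embs)
    logits = [[s / temperature for s in row] for row in sims]
    transposed = [list(col) for col in zip(*logits)]
    image_to_text = cross_entropy_diagonal(logits)
    text_to_image = cross_entropy_diagonal(transposed)
    return 0.5 * (image_to_text + text_to_image)


# Toy embeddings (assumed): aligned pairs versus deliberately swapped pairs.
matched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(matched < swapped)
```

A loss of this kind is small when each image embedding is closest to its own caption and large when captions are mismatched, which is one common way to encourage consistency between a synthesized image and its text.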


Pages: 1398-1403

Page count: 6

