Text to image synthesis with multi-granularity feature aware enhancement Generative Adversarial Networks

Cited by: 0

Authors:

Dong, Pei [1]

Wu, Lei [1]

Li, Ruichen [1]

Meng, Xiangxu [1]

Meng, Lei [1]

Affiliation:

[1] Shandong Univ, Sch Software, 1500 ShunHua Rd High Tech Ind Dev Zone, Jinan 250101, Peoples R China

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2024 / Vol. 245

Keywords:

Generative adversarial network; Multi-granularity feature aware enhancement; Text-to-image; Autoregressive; Diffusion;

DOI:

10.1016/j.cviu.2024.104042

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Synthesizing complex images from text is challenging. Compared to autoregressive and diffusion model-based methods, Generative Adversarial Network-based methods have significant advantages in computational cost and generation efficiency, yet two limitations remain: first, these methods often refine all features output from the previous stage indiscriminately, without considering that these features are initialized gradually during the generation process; second, the sparse semantic constraints provided by the text description are typically ineffective for refining fine-grained features. These issues complicate the balance between generation quality, computational cost, and inference speed. To address these issues, we propose a Multi-granularity Feature Aware Enhancement GAN (MFAE-GAN), which allows the refinement process to match the order in which features of different granularity are initialized. Specifically, MFAE-GAN (1) samples category-related coarse-grained features and instance-level detail-related fine-grained features at different generation stages, based on different attention mechanisms in Coarse-grained Feature Enhancement (CFE) and Fine-grained Feature Enhancement (FFE), to guide the generation process spatially; (2) provides denser semantic constraints than textual semantic information through Multi-granularity Features Adaptive Batch Normalization (MFA-BN) in the process of refining fine-grained features; and (3) adopts Global Semantics Preservation (GSP) to avoid the loss of global semantics when sampling features continuously. Extensive experimental results demonstrate that MFAE-GAN is competitive in terms of both image generation quality and efficiency.


Pages: 11
