Text to image synthesis with multi-granularity feature aware enhancement Generative Adversarial Networks

Cited by: 0

Authors:

Dong, Pei [1]

Wu, Lei [1]

Li, Ruichen [1]

Meng, Xiangxu [1]

Meng, Lei [1]

Affiliations:

[1] Shandong Univ, Sch Software, 1500 ShunHua Rd High Tech Ind Dev Zone, Jinan 250101, Peoples R China

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2024 / Vol. 245

Keywords:

Generative adversarial network; Multi-granularity feature aware enhancement; Text-to-image; Autoregressive; Diffusion;

DOI:

10.1016/j.cviu.2024.104042

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Synthesizing complex images from text is challenging. Compared to autoregressive and diffusion model-based methods, Generative Adversarial Network-based methods have significant advantages in computational cost and generation efficiency, yet two limitations remain: first, these methods often refine all features output from the previous stage indiscriminately, without considering that these features are initialized gradually during the generation process; second, the sparse semantic constraints provided by the text description are typically ineffective for refining fine-grained features. These issues complicate the balance between generation quality, computational cost and inference speed. To address these issues, we propose a Multi-granularity Feature Aware Enhancement GAN (MFAE-GAN), which allows the refinement process to match the order in which features of different granularity are initialized. Specifically, MFAE-GAN (1) samples category-related coarse-grained features and instance-level detail-related fine-grained features at different generation stages, based on different attention mechanisms in Coarse-grained Feature Enhancement (CFE) and Fine-grained Feature Enhancement (FFE), to guide the generation process spatially, (2) provides denser semantic constraints than textual semantic information through Multi-granularity Features Adaptive Batch Normalization (MFA-BN) in the process of refining fine-grained features, and (3) adopts Global Semantics Preservation (GSP) to avoid the loss of global semantics when sampling features continuously. Extensive experimental results demonstrate that MFAE-GAN is competitive in terms of both image generation quality and efficiency.


Pages: 11
