CF-GAN: cross-domain feature fusion generative adversarial network for text-to-image synthesis

Cited by: 14
Authors
Zhang, Yubo [1 ]
Han, Shuang [1 ]
Zhang, Zhongxin [1 ]
Wang, Jianyang [1 ]
Bi, Hongbo [1 ]
Affiliations
[1] Northeast Petr Univ, Sch Elect Informat Engn, Daqing 163000, Peoples R China
Keywords
Text-to-image; Generative adversarial networks; Residual structure; Deep learning
DOI
10.1007/s00371-022-02404-6
CLC Number
TP31 [Computer Software]
Discipline Code
081202; 0835
Abstract
In recent years, generative adversarial networks have successfully synthesized images from text descriptions. However, the generated images often fail to reflect the semantics of the text description deeply, the target object is frequently incomplete, and its texture structure is not rich enough. To address these problems, we propose a network framework, the cross-domain feature fusion generative adversarial network (CF-GAN), which refines the generated images through deep feature fusion using two modules: a feature fusion-enhanced response module (FFERM) and a multi-branch residual module (MBRM). FFERM deeply integrates word-level vector features with image features. MBRM is a simple yet novel residual structure that replaces the traditional residual module to extract features more fully. Experiments on the CUB and COCO datasets show that, compared with AttnGAN, the Inception Score on CUB improves from 4.36 to 4.83 (a 10.78% increase), and, compared with DM-GAN, the Inception Score on COCO increases from 30.49 to 31.13 (a 2.06% increase). Extensive experiments and ablation studies demonstrate the superiority of the proposed CF-GAN over other methods.
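To make the two modules named in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of what a word-image fusion step (in the spirit of FFERM) and a multi-branch residual block (in the spirit of MBRM) could look like. The class names, tensor shapes, and branch layout are illustrative assumptions drawn only from the abstract, not the authors' published design.

# Hypothetical sketch, not the authors' code: attention-style word-image fusion
# and a parallel-branch residual block, as suggested by the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordImageFusion(nn.Module):
    """Fuse word embeddings into image feature maps via attention (FFERM-like)."""
    def __init__(self, word_dim, img_channels):
        super().__init__()
        # Project word vectors into the image channel space.
        self.proj = nn.Linear(word_dim, img_channels)

    def forward(self, img_feat, word_emb):
        # img_feat: (B, C, H, W); word_emb: (B, T, D)
        B, C, H, W = img_feat.shape
        words = self.proj(word_emb)                    # (B, T, C)
        queries = img_feat.flatten(2).transpose(1, 2)  # (B, H*W, C)
        # Each spatial location attends over the word vectors.
        attn = torch.softmax(queries @ words.transpose(1, 2), dim=-1)  # (B, H*W, T)
        context = attn @ words                         # (B, H*W, C)
        # Residual fusion of the text context with the image features.
        return img_feat + context.transpose(1, 2).reshape(B, C, H, W)

class MultiBranchResBlock(nn.Module):
    """Residual block with parallel conv branches summed with the identity (MBRM-like)."""
    def __init__(self, channels):
        super().__init__()
        self.branch3 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, kernel_size=5, padding=2)

    def forward(self, x):
        # Sum the identity and both branches, then apply a nonlinearity.
        return F.relu(x + self.branch3(x) + self.branch5(x))

For example, fused = WordImageFusion(256, 64)(img_feat, word_emb) followed by MultiBranchResBlock(64)(fused) mirrors the abstract's idea of fusing text and image features before refining them with a richer residual structure.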
Pages: 1283-1293
Page count: 11