KT-GAN: Knowledge-Transfer Generative Adversarial Network for Text-to-Image Synthesis

Cited by: 50

Authors:

Tan, Hongchen [1]

Liu, Xiuping [1]

Liu, Meng [2]

Yin, Baocai [3,4]

Li, Xin [5,6]

Affiliations:

[1] Dalian Univ Technol, Sch Math Sci, Dalian 116024, Peoples R China

[2] Shandong Jianzhu Univ, Sch Comp Sci & Technol, Jinan 250101, Peoples R China

[3] Dalian Univ Technol, Dept Elect Informat & Elect Engn, Dalian 116024, Peoples R China

[4] Peng Cheng Lab, Shenzhen 518055, Peoples R China

[5] Louisiana State Univ, Sch Elect Engn & Comp Sci, Baton Rouge, LA 70803 USA

[6] Louisiana State Univ, Ctr Computat & Technol, Baton Rouge, LA 70803 USA

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2021 / Vol. 30

Funding:

National Natural Science Foundation of China; U.S. National Science Foundation;

Keywords:

Task analysis; Semantics; Generators; Generative adversarial networks; Knowledge engineering; Feature extraction; Image generation; Generative adversarial network; knowledge distillation; Text-to-Image Generation; alternate attention-transfer mechanism;

DOI:

10.1109/TIP.2020.3026728

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a new framework, the Knowledge-Transfer Generative Adversarial Network (KT-GAN), for fine-grained text-to-image generation. We introduce two novel mechanisms, an Alternate Attention-Transfer Mechanism (AATM) and a Semantic Distillation Mechanism (SDM), to help the generator better bridge the cross-domain gap between text and image. The AATM alternately updates the attention weights of words and the attention weights of image sub-regions, progressively highlighting important word information and enriching the details of synthesized images. The SDM uses the image encoder trained on the Image-to-Image task to guide the training of the text encoder in the Text-to-Image task, yielding better text features and higher-quality images. In extensive experimental validation on two public datasets, KT-GAN significantly outperforms the baseline method and achieves competitive results across different evaluation metrics.
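The distillation idea behind the SDM, in which a trained image encoder guides the text encoder, can be illustrated with a minimal feature-matching sketch. This is not the paper's implementation: the mean-squared-error loss, the function names, the toy vectors, and the direct update of the student feature (rather than of encoder parameters) are all assumptions made for illustration.

```python
from typing import List, Sequence


def feature_distillation_loss(teacher: Sequence[float],
                              student: Sequence[float]) -> float:
    """Mean squared error between a teacher and a student feature vector.

    Assumption: MSE stands in for whatever matching loss the paper uses.
    """
    if len(teacher) != len(student):
        raise ValueError("feature vectors must have the same length")
    return sum((t - s) ** 2 for t, s in zip(teacher, student)) / len(teacher)


def distillation_step(teacher: Sequence[float],
                      student: Sequence[float],
                      lr: float) -> List[float]:
    """One gradient-descent step on the student feature; the teacher is frozen.

    The gradient of mean((t_i - s_i)^2) with respect to s_i is
    -2 * (t_i - s_i) / n, so descending moves s_i toward t_i.
    """
    n = len(teacher)
    return [s + lr * 2.0 * (t - s) / n for t, s in zip(teacher, student)]


# Hypothetical toy features: the image encoder is the teacher,
# the text encoder is the student.
image_feature = [1.0, 0.0, -1.0]
text_feature = [0.0, 0.0, 0.0]

loss_before = feature_distillation_loss(image_feature, text_feature)
text_feature = distillation_step(image_feature, text_feature, lr=0.5)
loss_after = feature_distillation_loss(image_feature, text_feature)

print(loss_before, loss_after)
```

In this toy setting the loss shrinks after the step because only the student feature moves, which mirrors the one-directional guidance from image encoder to text encoder described in the abstract.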


Pages: 1275-1290

Page count: 16
