KT-GAN: Knowledge-Transfer Generative Adversarial Network for Text-to-Image Synthesis

Citations: 50
Authors
Tan, Hongchen [1]
Liu, Xiuping [1]
Liu, Meng [2]
Yin, Baocai [3, 4]
Li, Xin [5, 6]
Affiliations
[1] Dalian Univ Technol, Sch Math Sci, Dalian 116024, Peoples R China
[2] Shandong Jianzhu Univ, Sch Comp Sci & Technol, Jinan 250101, Peoples R China
[3] Dalian Univ Technol, Dept Elect Informat & Elect Engn, Dalian 116024, Peoples R China
[4] Peng Cheng Lab, Shenzhen 518055, Peoples R China
[5] Louisiana State Univ, Sch Elect Engn & Comp Sci, Baton Rouge, LA 70803 USA
[6] Louisiana State Univ, Ctr Computat & Technol, Baton Rouge, LA 70803 USA
Funding
National Natural Science Foundation of China; National Science Foundation (USA);
Keywords
Task analysis; Semantics; Generators; Generative adversarial networks; Knowledge engineering; Feature extraction; Image generation; Generative adversarial network; knowledge distillation; Text-to-Image Generation; alternate attention-transfer mechanism;
DOI
10.1109/TIP.2020.3026728
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper presents a new framework, the Knowledge-Transfer Generative Adversarial Network (KT-GAN), for fine-grained text-to-image generation. We introduce two novel mechanisms, an Alternate Attention-Transfer Mechanism (AATM) and a Semantic Distillation Mechanism (SDM), to help the generator better bridge the cross-domain gap between text and image. The AATM alternately updates the attention weights of words and the attention weights of image sub-regions, progressively highlighting important word information and enriching the details of synthesized images. The SDM uses the image encoder trained on the Image-to-Image task to guide the training of the text encoder in the Text-to-Image task, yielding better text features and higher-quality images. In extensive experimental validation on two public datasets, KT-GAN significantly outperforms the baseline method and achieves competitive results across different evaluation metrics.
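The alternating update described for the AATM can be illustrated with a small sketch. This is not the paper's formulation: the abstract gives no equations, so the dot-product similarity, the softmax normalization, the uniform initialization, and the names `alternate_attention`, `words`, `regions`, and `steps` are all assumptions made here for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def alternate_attention(words, regions, steps=2):
    """Hypothetical alternating update of word and region attention weights.

    words:   list of T word feature vectors
    regions: list of N image sub-region feature vectors
    Returns (word_weights, region_weights), each summing to 1.
    """
    n_words, n_regions = len(words), len(regions)
    # Assumed similarity: plain dot product between region and word features.
    sim = [[dot(r, w) for w in words] for r in regions]  # N x T

    # Assumed initialization: uniform attention on both sides.
    word_w = [1.0 / n_words] * n_words
    region_w = [1.0 / n_regions] * n_regions

    for _ in range(steps):
        # Step 1: refresh word weights using the current region weights.
        word_w = softmax([
            sum(region_w[i] * sim[i][j] for i in range(n_regions))
            for j in range(n_words)
        ])
        # Step 2: refresh region weights using the just-updated word weights.
        region_w = softmax([
            sum(word_w[j] * sim[i][j] for j in range(n_words))
            for i in range(n_regions)
        ])
    return word_w, region_w


# Toy input: two words, three sub-regions, 2-dimensional features.
word_w, region_w = alternate_attention(
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [1.0, 0.0], [0.0, 0.5]],
)
```

In this toy input the first word matches two of the three regions strongly, so its weight grows at each step, and the regions aligned with it gain weight in turn. That mutual reinforcement is the only property the sketch is meant to show.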
Pages: 1275-1290
Page count: 16