Learning discriminative and representative feature with cascade GAN for generalized zero-shot learning

Cited by: 15
Authors
Liu, Jingren [1]
Fu, Liyong [2]
Zhang, Haofeng [1]
Ye, Qiaolin [3]
Yang, Wankou [4]
Liu, Li [5]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Peoples R China
[2] Chinese Acad Forestry, Inst Forest Resource Informat Tech, Beijing, Peoples R China
[3] Nanjing Forestry Univ, E Coll Informat Sci & Technol, Nanjing, Peoples R China
[4] Southeast Univ, Sch Automat, Nanjing, Peoples R China
[5] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
Funding
National Natural Science Foundation of China;
Keywords
Generalized zero-shot learning; Generative models; Orthogonality; Cascade GAN;
DOI
10.1016/j.knosys.2021.107780
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Zero-Shot Learning (ZSL) aims to identify unseen images by transferring knowledge from seen images and their associated semantics. Among existing methods, generative methods are more prominent and achieve better results than the others. However, we find that the input used to generate samples is too monotonous: it consists only of the semantics of each class and artificially defined noise, which makes the generated visual features non-discriminative, so the classifier cannot distinguish them effectively. To solve this problem, we propose a novel approach that uses a cascade Generative Adversarial Network (GAN) to generate discriminative and representative features. In this method, we define a latent space in which the features from different categories are orthogonal to each other, and the generator for this latent space is learned with a Wasserstein GAN. In addition, because the features in this latent space cannot accurately simulate the true distribution of species, we cascade another Wasserstein GAN or Cramer GAN with the first one to generate more discriminative and representative visual features. In this way, we not only expand the content used as input in the generation process, but also make the final generated visual features clear and separable under the influence of the latent space orthogonality. Extensive experiments on five benchmark datasets, i.e., AWA1, AWA2, CUB, SUN and APY, demonstrate that our proposed method can outperform most of the state-of-the-art methods in both conventional and generalized zero-shot learning settings. (C) 2021 Elsevier B.V. All rights reserved.
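The abstract states that features from different categories are orthogonal in the latent space, but it does not give the loss that enforces this. The following is a minimal sketch of one possible way to measure such orthogonality, not the authors' formulation: the function names, the squared-cosine penalty, and the toy vectors are all assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def orthogonality_penalty(class_features):
    """Sum of squared cosine similarities over all pairs of distinct classes.

    Hypothetical penalty (an assumption, not taken from the paper): it is
    zero exactly when the per-class latent features are mutually orthogonal.
    """
    unit = [normalize(v) for v in class_features]
    penalty = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            cos = sum(a * b for a, b in zip(unit[i], unit[j]))
            penalty += cos * cos
    return penalty


# Toy latent features, one vector per class (illustrative values only).
orthogonal = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
overlapping = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(orthogonality_penalty(orthogonal))   # no overlap between classes
print(orthogonality_penalty(overlapping))  # first two classes overlap
```

In a training setup, a term of this kind would typically be added to the generator objective of the first GAN so that latent features of different classes are pushed apart; how the paper actually combines it with the Wasserstein loss is not stated in this record.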
Pages: 13