A Probabilistic Zero-Shot Learning Method via Latent Nonnegative Prototype Synthesis of Unseen Classes

Cited by: 19
Authors
Zhang, Haofeng [1]
Mao, Huaqi [1]
Long, Yang [2]
Yang, Wankou [3]
Shao, Ling [4, 5]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Univ Durham, Dept Comp Sci, Durham DH1 3LE, England
[3] Southeast Univ, Sch Automat, Nanjing 210018, Peoples R China
[4] Incept Inst Artificial Intelligence IIAI, Abu Dhabi 999041, U Arab Emirates
[5] Univ Chinese Acad Sci, Sch Engn Sci, Beijing 100864, Peoples R China
Funding
National Science Foundation (USA); Medical Research Council (UK);
Keywords
Prototypes; Visualization; Training; Artificial neural networks; Probabilistic logic; Semantics; Covariance matrices; Nonnegative matrix factorization (NMF); prototype synthesis; triplet network; zero-shot learning (ZSL);
DOI
10.1109/TNNLS.2019.2955157
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Zero-shot learning (ZSL), a type of structured multioutput learning, has attracted much attention due to its requirement of no training data for target classes. Conventional ZSL methods usually project visual features into semantic space and assign labels by finding their nearest prototypes. However, this type of nearest neighbor search (NNS)-based method often suffers from great performance degradation because of the nonuniform variances between different categories. In this article, we propose a probabilistic framework by taking covariance into account to deal with the above-mentioned problem. In this framework, we define a new latent space, which has two characteristics. The first is that the features in this space should gather within the classes and scatter between the classes, which is implemented by triplet learning; the second is that the prototypes of unseen classes are synthesized with nonnegative coefficients, which are generated by nonnegative matrix factorization (NMF) of relations between the seen classes and the unseen classes in attribute space. During training, the learned parameters are the projection model for triplet network and the nonnegative coefficients between the unseen classes and the seen classes. In the testing phase, visual features are projected into latent space and assigned with the labels that have the maximum probability among unseen classes for classic ZSL or within all classes for generalized ZSL. Extensive experiments are conducted on four popular data sets, and the results show that the proposed method can outperform the state-of-the-art methods in most circumstances.
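The pipeline described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes nonnegative attribute vectors, replaces the paper's NMF step with a simple multiplicative-update nonnegative regression of unseen attributes onto seen attributes, skips the triplet network entirely, and uses hand-picked toy prototypes and covariances. All names (`nonneg_coefficients`, `classify`, `A_seen`, `P_seen`, and so on) are hypothetical.

```python
import numpy as np

def nonneg_coefficients(A_seen, A_unseen, n_iter=2000, eps=1e-9):
    """Find W >= 0 with A_unseen ~= W @ A_seen (multiplicative updates).

    Assumption: both attribute matrices are nonnegative. This stands in
    for the NMF step named in the abstract; it is not the paper's solver.
    """
    rng = np.random.default_rng(0)
    W = rng.random((A_unseen.shape[0], A_seen.shape[0]))
    numer = A_unseen @ A_seen.T      # fixed numerator term
    gram = A_seen @ A_seen.T         # Gram matrix of seen attributes
    for _ in range(n_iter):
        W *= numer / (W @ gram + eps)  # update keeps W nonnegative
    return W

def gaussian_log_likelihood(z, mean, cov):
    """Log density of z under N(mean, cov)."""
    diff = z - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (maha + logdet + z.shape[0] * np.log(2 * np.pi))

def classify(z, means, covs):
    """Index of the class with the highest Gaussian log-likelihood."""
    scores = [gaussian_log_likelihood(z, m, c) for m, c in zip(means, covs)]
    return int(np.argmax(scores))

# Toy attribute space: 3 seen classes, 2 unseen classes, 4 attributes.
A_seen = np.array([[1.0, 0.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0, 1.0],
                   [0.0, 0.0, 1.0, 0.0]])
W_true = np.array([[0.7, 0.3, 0.0],
                   [0.0, 0.2, 0.8]])
A_unseen = W_true @ A_seen

W = nonneg_coefficients(A_seen, A_unseen)

# Toy latent prototypes of the seen classes (a trained projection would
# produce these in practice); unseen prototypes are nonnegative mixtures.
P_seen = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
P_unseen = W @ P_seen

# Assumed per-class covariances: class 0 is tight, class 1 is spread out.
covs = [0.05 * np.eye(2), 2.0 * np.eye(2)]

z = np.array([1.0, 1.4])  # a projected test feature
nn_label = int(np.argmin(np.linalg.norm(P_unseen - z, axis=1)))
prob_label = classify(z, P_unseen, covs)
print("nearest-prototype label:", nn_label)
print("max-likelihood label:", prob_label)
```

In this toy setup the test point lies closer to the tight class in Euclidean distance, yet the likelihood favors the class with the larger variance. That is the kind of disagreement between nearest neighbor search and a covariance-aware decision rule that the abstract points to.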
Pages: 2361-2375
Number of pages: 15