Manifold regularized cross-modal embedding for zero-shot learning

Cited by: 32
Authors:
Ji, Zhong [1]
Yu, Yunlong [1]
Pang, Yanwei [1]
Guo, Jichang [1]
Zhang, Zhongfei [2]
Affiliations:
[1] Tianjin Univ, Sch Elect Informat Engn, Tianjin 300072, Peoples R China
[2] SUNY Binghamton, Dept Comp Sci, Binghamton, NY 13902 USA
Funding:
National Natural Science Foundation of China;
Keywords:
Zero-shot learning; Image classification; Cross-modal embedding; Manifold; Domain adaptation; RECOGNITION;
DOI:
10.1016/j.ins.2016.10.025
Chinese Library Classification (CLC):
TP [automation technology, computer technology];
Discipline code:
0812;
Abstract:
Zero-Shot Learning (ZSL) aims at classifying samples from previously unseen classes and has gained popularity in applications where training samples of some categories are scarce. The basic idea is to transfer knowledge from the seen classes to the unseen classes by mapping visual features to an embedding space spanned by class semantic information, which can be obtained from human-labeled attributes or from a text corpus in an unsupervised fashion. The embedding function from the visual space to this embedding space is therefore crucial. However, existing embedding approaches to ZSL mainly focus on aligning pairwise semantic consistency across the heterogeneous spaces while ignoring the intrinsic local structure within the homogeneous visual space. To preserve the local visual structure during embedding, this paper proposes a Manifold regularized Cross-Modal Embedding (MCME) approach for ZSL that formulates a manifold constraint on the intrinsic structure of the visual features in addition to aligning pairwise consistency. Its linear, closed-form solution makes MCME efficient to compute. Furthermore, rather than applying the embedding function learned from the seen classes directly, we also propose a new domain adaptation strategy to overcome the domain-shift problem during knowledge transfer; MCME combined with this domain adaptation method is called MCME-DA. Extensive experiments on the benchmark datasets AwA and CUB validate the superiority and promise of MCME and MCME-DA. (C) 2016 Elsevier Inc. All rights reserved.
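As an informal illustration (not reproduced from the paper), the kind of objective the abstract describes can be sketched as a manifold-regularized linear regression from visual features to class semantic vectors. Here $X \in \mathbb{R}^{n \times d}$ stacks the seen-class visual features, $S \in \mathbb{R}^{n \times k}$ the corresponding class semantic vectors (attributes or word embeddings), $L$ is the graph Laplacian of a nearest-neighbour graph built on $X$ (the manifold constraint), $W \in \mathbb{R}^{d \times k}$ is the linear embedding, and $\lambda, \gamma \ge 0$ are trade-off weights; all of these symbols are assumptions of this sketch rather than the paper's notation:

$$\min_{W}\; \|XW - S\|_F^2 \;+\; \lambda\,\operatorname{tr}\!\big(W^\top X^\top L X W\big) \;+\; \gamma\,\|W\|_F^2,$$

whose gradient condition $X^\top(XW - S) + \lambda X^\top L X W + \gamma W = 0$ yields the closed-form solution

$$W^{*} = \big(X^\top X + \lambda\, X^\top L X + \gamma I\big)^{-1} X^\top S.$$

At test time a visual feature $x$ would be projected as $x^\top W^{*}$ and matched to the nearest unseen-class semantic vector; per the abstract, MCME-DA additionally adapts the learned embedding to the unseen domain to mitigate domain shift before this matching step.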
Pages: 48 - 58
Number of pages: 11
Related Papers:
50 records in total
  • [1] Cross-modal distribution alignment embedding network for generalized zero-shot learning. Li, Qin; Hou, Mingzhen; Lai, Hong; Yang, Ming. NEURAL NETWORKS, 2022, 148: 176-182.
  • [2] Cross-modal Zero-shot Hashing. Liu, Xuanwu; Li, Zhao; Wang, Jun; Yu, Guoxian; Domeniconi, Carlotta; Zhang, Xiangliang. 2019 19TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2019), 2019: 449-458.
  • [3] Cross-modal Representation Learning for Zero-shot Action Recognition. Lin, Chung-Ching; Lin, Kevin; Wang, Lijuan; Liu, Zicheng; Li, Linjie. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 19946-19956.
  • [4] Cross-modal propagation network for generalized zero-shot learning. Guo, Ting; Liang, Jianqing; Liang, Jiye; Xie, Guo-Sen. PATTERN RECOGNITION LETTERS, 2022, 159: 125-131.
  • [5] Generalized Zero-Shot Cross-Modal Retrieval. Dutta, Titir; Biswas, Soma. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (12): 5953-5962.
  • [6] Learning Deep Cross-Modal Embedding Networks for Zero-Shot Remote Sensing Image Scene Classification. Li, Yansheng; Zhu, Zhihui; Yu, Jin-Gang; Zhang, Yongjun. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2021, 59 (12): 10590-10603.
  • [7] Learning Aligned Cross-Modal Representation for Generalized Zero-Shot Classification. Fang, Zhiyu; Zhu, Xiaobin; Yang, Chun; Han, Zheng; Qin, Jingyan; Yin, Xu-Cheng. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 6605-6613.
  • [8] DUET: Cross-Modal Semantic Grounding for Contrastive Zero-Shot Learning. Chen, Zhuo; Huang, Yufeng; Chen, Jiaoyan; Geng, Yuxia; Zhang, Wen; Fang, Yin; Pan, Jeff Z.; Chen, Huajun. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023: 405-413.
  • [9] Cross-modal prototype learning for zero-shot handwritten character recognition. Ao, Xiang; Zhang, Xu-Yao; Liu, Cheng-Lin. PATTERN RECOGNITION, 2022, 131.
  • [10] A Cross-Modal Alignment for Zero-Shot Image Classification. Wu, Lu; Wu, Chenyu; Guo, Han; Zhao, Zhihao. IEEE ACCESS, 2023, 11: 9067-9073.