Manifold regularized cross-modal embedding for zero-shot learning

Cited by: 32
Authors
Ji, Zhong [1]
Yu, Yunlong [1]
Pang, Yanwei [1]
Guo, Jichang [1]
Zhang, Zhongfei [2]
Affiliations
[1] Tianjin Univ, Sch Elect Informat Engn, Tianjin 300072, Peoples R China
[2] SUNY Binghamton, Dept Comp Sci, Binghamton, NY 13902 USA
Funding
National Natural Science Foundation of China;
Keywords
Zero-shot learning; Image classification; Cross-modal embedding; Manifold; Domain adaptation; RECOGNITION;
DOI
10.1016/j.ins.2016.10.025
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Zero-Shot Learning (ZSL) aims to classify samples from previously unseen classes and has gained popularity in applications where training samples for some categories are scarce. The basic idea is to transfer knowledge from the seen classes to the unseen classes by mapping visual features into an embedding space spanned by class semantic information. The class semantic information can be obtained from human-labeled attributes or, in an unsupervised fashion, from text corpora. The embedding function from the visual space to the embedding space is therefore extremely important. However, existing embedding approaches to ZSL mainly focus on aligning pairwise semantic consistency across heterogeneous spaces but ignore the intrinsic structure of the locally homogeneous isomorph. To preserve the local visual structure during embedding, this paper proposes a Manifold regularized Cross-Modal Embedding (MCME) approach for ZSL that formulates a manifold constraint on the intrinsic structure of the visual features while also aligning pairwise consistency. Its linear, closed-form solution makes MCME efficient to compute. Furthermore, rather than directly applying the embedding function learned from the seen classes, the paper proposes a new domain adaptation strategy to overcome the domain-shift problem during knowledge transfer. MCME combined with this domain adaptation method is called MCME-DA. Extensive experiments on the AwA and CUB benchmark datasets validate the superiority and promise of MCME and MCME-DA. (C) 2016 Elsevier Inc. All rights reserved.
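
The abstract describes a linear embedding with a manifold constraint and a closed-form solution, but this record does not give the objective itself. As a rough illustration only, the sketch below assumes a generic Laplacian-regularized ridge regression, min_W ||XW - S||^2 + lam ||W||^2 + gamma tr(W^T X^T L X W), whose minimizer is W = (X^T X + lam I + gamma X^T L X)^{-1} X^T S. The function names, the k-NN graph construction, the hyperparameters `lam`, `gamma`, and `k`, and the cosine nearest-prototype classifier are all assumptions for illustration; they are not taken from the paper, and the domain adaptation step of MCME-DA is not modeled.

```python
import numpy as np


def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - A of a symmetrized k-NN graph on rows of X."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]               # k nearest neighbors per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)                            # make the graph undirected
    return np.diag(A.sum(axis=1)) - A


def fit_embedding(X, S, lam=1.0, gamma=0.1, k=5):
    """Closed-form W for the assumed objective (hypothetical, not the paper's).

    X: (n, d) visual features of seen-class samples.
    S: (n, m) semantic vector of each sample's class (e.g. attributes).
    """
    L = knn_graph_laplacian(X, k)
    d = X.shape[1]
    M = X.T @ X + lam * np.eye(d) + gamma * X.T @ L @ X
    return np.linalg.solve(M, X.T @ S)                # solves M W = X^T S


def predict_unseen(X_test, W, prototypes):
    """Assign each test sample to the unseen class with the most similar prototype."""
    Z = X_test @ W
    Zn = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)
    Pn = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-12)
    return np.argmax(Zn @ Pn.T, axis=1)               # cosine similarity argmax
```

Because the Laplacian is positive semidefinite, the system matrix is positive definite whenever `lam > 0`, so a single linear solve suffices; this is one plausible reading of why a linear, closed-form solution is cheap to compute.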
Pages: 48-58
Number of pages: 11