A Cross-Modal Alignment for Zero-Shot Image Classification

Cited by: 5
Authors
Wu, Lu [1, 2]
Wu, Chenyu [2 ]
Guo, Han [3 ]
Zhao, Zhihao [2 ]
Affiliations
[1] Minist Nat Resources, Key Lab Urban Land Resources Monitoring & Simulat, Shenzhen 518000, Peoples R China
[2] Wuhan Univ Technol, Sch Informat Engn, Wuhan 430070, Peoples R China
[3] Wuhan Univ, Sch Resource & Environm Sci, Wuhan 430079, Peoples R China
Keywords
Visualization; Semantics; Training data; Feature extraction; Object recognition; Monitoring; Image classification; Cross-modal alignment; Zero-shot image classification; Text attribute query; Cosine similarity
DOI
10.1109/ACCESS.2023.3237966
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Unlike mainstream classification methods, which rely on large amounts of annotated data, we introduce a cross-modal alignment method for zero-shot image classification. The key idea is to use text attribute queries learned from the seen classes to guide local feature responses in unseen classes. First, an encoder aligns visual features semantically with their corresponding text attributes. Second, an attention module produces response maps from the feature maps activated by the text attribute queries. Finally, the cosine distance metric measures how well each text attribute matches its corresponding feature response. Experimental results show that the method outperforms existing embedding-based zero-shot learning methods, as well as generative methods, on the CUB-200-2011 dataset.
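The three steps in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the architecture, so the scaled dot-product attention, the per-class attribute matrix, and every function and variable name below are assumptions made for illustration. The sketch also uses cosine similarity (one minus cosine distance) so that a larger score means a better match.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attribute_responses(feature_map, attribute_queries):
    """Attend over local features with each text attribute as a query.

    feature_map:       (R, D) local visual features for R image regions,
                       assumed already projected into the text space.
    attribute_queries: (A, D) text attribute embeddings (hypothetical).
    Returns an (A, D) array: one attended feature response per attribute.
    """
    dim = feature_map.shape[1]
    scores = attribute_queries @ feature_map.T / np.sqrt(dim)  # (A, R)
    attention = softmax(scores, axis=1)  # response map over regions
    return attention @ feature_map       # (A, D)

def cosine_similarity(a, b, eps=1e-8):
    # Row-wise cosine similarity with broadcasting over leading axes.
    a_n = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return (a_n * b_n).sum(axis=-1)

def classify(feature_map, attribute_queries, class_attributes):
    """Score classes by how well attribute responses match their queries.

    class_attributes: (C, A) attribute signature of each class (assumed),
                      which may include classes unseen during training.
    Returns (predicted class index, (C,) class scores).
    """
    responses = attribute_responses(feature_map, attribute_queries)
    attr_scores = cosine_similarity(responses, attribute_queries)  # (A,)
    class_scores = cosine_similarity(class_attributes, attr_scores[None, :])
    return int(np.argmax(class_scores)), class_scores

# Toy example: three regions, three attributes, two candidate classes.
features = np.array([[5.0, 0, 0, 0], [0, 5.0, 0, 0], [0, 0, 0, 5.0]])
queries = np.eye(4)[:3]  # attributes aligned with the first three axes
classes = np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
predicted, scores = classify(features, queries, classes)
print(predicted, scores)
```

In the toy example, the image regions respond to the first two attributes and not to the third, so the class whose signature contains those two attributes receives the higher score. Because classification depends only on attribute signatures, a class with no training images can still be scored this way.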
Pages: 9067-9073
Number of pages: 7