A Cross-Modal Alignment for Zero-Shot Image Classification

Cited by: 5
Authors
Wu, Lu [1, 2]
Wu, Chenyu [2]
Guo, Han [3]
Zhao, Zhihao [2]
Affiliations
[1] Minist Nat Resources, Key Lab Urban Land Resources Monitoring & Simulat, Shenzhen 518000, Peoples R China
[2] Wuhan Univ Technol, Sch Informat Engn, Wuhan 430070, Peoples R China
[3] Wuhan Univ, Sch Resource & Environm Sci, Wuhan 430079, Peoples R China
Keywords
Visualization; Semantics; Training data; Feature extraction; Object recognition; Monitoring; Image classification; Cross-modal alignment; zero-shot image classification; text attribute query; cosine similarity;
DOI
10.1109/ACCESS.2023.3237966
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Unlike mainstream classification methods, which rely on large amounts of annotated data, we introduce a cross-modal alignment method for zero-shot image classification. The key idea is to use text attribute queries learned from the seen classes to guide local feature responses in the unseen classes. First, an encoder aligns visual features semantically with their corresponding text attributes. Second, an attention module produces response maps from the feature maps activated by the text attribute queries. Finally, a cosine distance metric measures how well each text attribute matches its corresponding feature response. Experimental results show that the method outperforms existing embedding-based zero-shot learning methods, as well as generative methods, on the CUB-200-2011 dataset.
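The three steps named in the abstract (attribute queries, attention over local features, cosine matching) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the scaled dot-product form of the attention, the tensor shapes, and the final class-scoring step are all assumptions made for illustration, and the encoder is taken as given (its outputs are simply passed in as arrays).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cosine(a, b, eps=1e-8):
    # Row-wise cosine similarity; inputs broadcast against each other.
    a = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return (a * b).sum(axis=-1)

def attribute_scores(feature_map, attr_queries):
    """Score each attribute against the image (hypothetical helper).

    feature_map:  (N, D) local visual features, N = H * W locations.
    attr_queries: (A, D) text attribute embeddings in the same space
                  (assumed to be already aligned by an encoder).
    """
    d = feature_map.shape[1]
    # Assumed attention form: each attribute query attends over locations.
    attn = softmax(attr_queries @ feature_map.T / np.sqrt(d), axis=1)  # (A, N)
    # Attention-weighted feature response for each attribute.
    responses = attn @ feature_map                                     # (A, D)
    # Cosine similarity between each attribute and its own response.
    return cosine(attr_queries, responses), attn

def classify(feature_map, attr_queries, class_attributes):
    """Pick the class whose attribute signature best matches the scores.

    class_attributes: (C, A) per-class attribute vectors (assumed input).
    """
    scores, _ = attribute_scores(feature_map, attr_queries)
    class_sims = cosine(class_attributes, scores[None, :])             # (C,)
    return int(np.argmax(class_sims)), class_sims
```

Because the class signatures enter only at the last step, unseen classes can be scored by supplying their attribute vectors in `class_attributes` without retraining; that property is what the sketch is meant to show, under the assumptions stated above.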
Pages: 9067-9073
Number of pages: 7