A Cross-Modal Alignment for Zero-Shot Image Classification

Cited by: 6
Authors
Wu, Lu [1, 2]
Wu, Chenyu [2]
Guo, Han [3]
Zhao, Zhihao [2]
Affiliations
[1] Minist Nat Resources, Key Lab Urban Land Resources Monitoring & Simulat, Shenzhen 518000, Peoples R China
[2] Wuhan Univ Technol, Sch Informat Engn, Wuhan 430070, Peoples R China
[3] Wuhan Univ, Sch Resource & Environm Sci, Wuhan 430079, Peoples R China
Keywords
Visualization; Semantics; Training data; Feature extraction; Object recognition; Monitoring; Image classification; Cross-modal alignment; Zero-shot image classification; Text attribute query; Cosine similarity
DOI
10.1109/ACCESS.2023.3237966
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Unlike mainstream classification methods, which rely on large amounts of annotated data, we introduce a cross-modal alignment method for zero-shot image classification. The key idea is to use text attribute queries learned from the seen classes to guide local feature responses in unseen classes. First, an encoder aligns visual features with their corresponding text attributes. Second, an attention module produces response maps from the feature maps activated by the text attribute queries. Finally, the cosine distance metric measures the degree of matching between each text attribute and its corresponding feature response. Experimental results show that the method outperforms existing embedding-based zero-shot learning methods, as well as other generative methods, on the CUB-200-2011 dataset.
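
The three steps in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names, the softmax attention pooling, the toy 2-D vectors, and the final class decision (comparing the per-attribute score vector with a class attribute signature) are all assumptions made for illustration. The encoder is assumed to have already mapped region features and attribute queries into a shared space.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot(a, b) / (na * nb)

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, regions):
    """Assumed attention step: pool region features, weighted by
    their softmax-normalised dot product with one attribute query."""
    weights = softmax([dot(query, r) for r in regions])
    dim = len(regions[0])
    return [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]

def classify(regions, attribute_queries, class_attributes):
    """Score each attribute by the cosine similarity between its query
    and its attended response, then pick the class whose attribute
    signature best matches the score vector (an assumed decision rule)."""
    scores = [cosine(q, attend(q, regions)) for q in attribute_queries]
    label = max(class_attributes,
                key=lambda name: cosine(scores, class_attributes[name]))
    return label, scores

# Toy data (hypothetical): two attributes, two image regions, two unseen classes.
attribute_queries = [[1.0, 0.0], [0.0, 1.0]]
regions = [[3.0, 0.0], [2.5, 0.1]]
class_attributes = {"class_a": [1.0, 0.0], "class_b": [0.0, 1.0]}

label, scores = classify(regions, attribute_queries, class_attributes)
print(label, scores)
```

In this toy setup the regions respond strongly to the first attribute query and only weakly to the second, so the class whose signature emphasises the first attribute is selected. Because the decision uses only attribute signatures, no training images of the candidate classes are required, which is the zero-shot setting the abstract describes.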
Pages: 9067-9073
Page count: 7