A Cross-Modal Alignment for Zero-Shot Image Classification

Cited by: 5

Authors:

Wu, Lu [1,2]

Wu, Chenyu [2]

Guo, Han [3]

Zhao, Zhihao [2]

Affiliations:

[1] Minist Nat Resources, Key Lab Urban Land Resources Monitoring & Simulat, Shenzhen 518000, Peoples R China

[2] Wuhan Univ Technol, Sch Informat Engn, Wuhan 430070, Peoples R China

[3] Wuhan Univ, Sch Resource & Environm Sci, Wuhan 430079, Peoples R China

Source:

IEEE ACCESS | 2023 / Vol. 11

Keywords:

Visualization; Semantics; Training data; Feature extraction; Object recognition; Monitoring; Image classification; Cross-modal alignment; zero-shot image classification; text attribute query; cosine similarity;

DOI:

10.1109/ACCESS.2023.3237966

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Unlike mainstream classification methods, which rely on large amounts of annotated data, we introduce a cross-modal alignment method for zero-shot image classification. The key idea is to use text attribute queries learned from the seen classes to guide local feature responses in unseen classes. First, an encoder aligns visual features with their corresponding text attributes so that the two match semantically. Second, an attention module produces response maps from the feature maps activated by the text attribute queries. Finally, a cosine distance metric measures the degree of matching between each text attribute and its corresponding feature response. Experimental results show that the method outperforms existing embedding-based zero-shot learning methods, as well as generative methods, on the CUB-200-2011 dataset.
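The three steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no architectural details, so the scaled dot-product attention, the per-attribute response definition, and all names (`attribute_responses`, `classify`, `attr_queries`, `class_attributes`) are assumptions made for illustration. The encoder is assumed to have already projected local visual features and text attribute queries into a shared space.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cosine_rows(a, b, eps=1e-8):
    """Row-wise cosine similarity between two (N, D) arrays."""
    num = (a * b).sum(axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps
    return num / den


def attribute_responses(feature_map, attr_queries):
    """Score how strongly each attribute is present in one image.

    feature_map:  (R, D) local visual features for R image regions
                  (assumed to be already aligned with the text space).
    attr_queries: (A, D) one text attribute query per attribute.
    Returns an (A,) vector of responses.
    """
    d = feature_map.shape[1]
    # Each attribute query attends over all regions (assumed scaled dot-product).
    attn = softmax(attr_queries @ feature_map.T / np.sqrt(d), axis=1)  # (A, R)
    attended = attn @ feature_map                                      # (A, D)
    # Cosine similarity between each query and the feature it attended to.
    return cosine_rows(attended, attr_queries)


def classify(feature_map, attr_queries, class_attributes, eps=1e-8):
    """Pick the class whose attribute signature best matches the responses.

    class_attributes: (C, A) attribute signature per candidate class
                      (hypothetical; e.g. annotations for unseen classes).
    Returns (predicted class index, (C,) similarity scores).
    """
    resp = attribute_responses(feature_map, attr_queries)
    norms = np.linalg.norm(class_attributes, axis=1) * np.linalg.norm(resp) + eps
    sims = class_attributes @ resp / norms
    return int(np.argmax(sims)), sims
```

Because classification depends only on attribute queries and per-class attribute signatures, a class never seen during training can in principle be scored by supplying its signature as a new row of `class_attributes`.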


Pages: 9067-9073

Number of pages: 7
