Contrastive embedding and structural alignment for generalized zero-shot learning

Cited by: 0

Authors:

He, Chunmei [1]

Tang, Jing [1]

Ye, Zhengchun [2]

Zhou, Kang [1]

Wu, Shengyu [1]

Affiliations:

[1] Xiangtan Univ, Sch Comp Sci, Xiangtan 411105, Hunan, Peoples R China

[2] Xiangtan Univ, Sch Mech Engn & Mech, Xiangtan 411105, Hunan, Peoples R China

Source:

APPLIED SOFT COMPUTING | 2025 / Vol. 179

Keywords:

Zero-shot learning; Image classification; Deep learning; Embedding generation; Structural alignment; ADVERSARIAL NETWORK;

DOI:

10.1016/j.asoc.2025.113376

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Generalized zero-shot learning (GZSL) aims to transfer knowledge efficiently from seen to unseen classes by learning semantic attributes and visual features. However, previous works mainly suffer from two limitations: 1) because no unseen-class samples are available during training, embedding methods face a serious domain shift problem, which biases predictions toward the seen classes; 2) generative methods usually lack effective constraints during sample generation and do not consider the spatial structure consistency between semantic attributes and visual features, so the generated features lack discriminative information. To overcome these limitations, this paper proposes a generative GZSL method, the Contrastive Embedding and Structural Alignment Model (CESAM). First, we present a contrastive embedding module, which uses a contrastive loss in the embedding space to provide visual-level and semantic-level supervision for GZSL and encourages the model to construct a more accurate and discriminative embedding space. Second, we present a structural alignment module that recovers the correlation information lost in contrastive learning, maintains spatial structure consistency between semantic attributes and visual features, and further optimizes the learning of the feature generator with a reconstruction loss. Furthermore, we design an integrated GZSL module that combines the embedding module with the constrained generative module and establishes collaborative learning between them to learn a more discriminative and generalizable model. Finally, extensive experimental evaluations on four datasets demonstrate that CESAM achieves state-of-the-art performance.
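The abstract names a contrastive loss in the embedding space but does not give its formula. As a rough illustration only, the sketch below uses the widely known InfoNCE form with cosine similarity; this choice, the function names, the temperature value, and the toy vectors are all assumptions for illustration and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: negative log-softmax score of the positive.

    Assumed form for illustration; not confirmed to be the loss used by CESAM.
    """
    # Temperature-scaled similarities; the positive sits at index 0.
    logits = [cosine_similarity(anchor, positive) / temperature]
    logits += [cosine_similarity(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


# Toy 2-D embeddings (hypothetical values, not from the paper).
anchor = [1.0, 0.0]
same_class = [0.9, 0.1]
other_classes = [[0.0, 1.0], [-1.0, 0.0]]

# Loss when the positive really is the same-class embedding.
aligned = info_nce_loss(anchor, same_class, other_classes)
# Loss when an other-class embedding is wrongly treated as the positive.
misaligned = info_nce_loss(anchor, other_classes[0], [same_class, other_classes[1]])
print(aligned < misaligned)
```

The loss is small when the anchor lies close to its positive and far from the negatives, which is the sense in which a contrastive objective pushes an embedding space toward being more discriminative.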


Pages: 16

References: 55 entries

