Generalized zero-shot learning (GZSL) aims to efficiently transfer knowledge from seen to unseen classes by relating semantic attributes and visual features. However, previous works suffer from two main limitations. 1) Because no unseen-class samples are available during training, embedding-based methods face a serious domain shift problem, which biases predictions toward the seen classes; 2) generation-based methods usually lack effective constraints during sample generation and ignore the spatial structural consistency between semantic attributes and visual features, so the generated features lack discriminative information. To overcome these limitations, this paper proposes a generative GZSL method, the Contrastive Embedding and Structural Alignment Model (CESAM). Specifically, we first present a contrastive embedding module that applies a contrastive loss in the embedding space to provide visual-level and semantic-level supervision for GZSL, encouraging the model to construct a more accurate and discriminative embedding space. Second, we present a structural alignment module that recovers the correlation information lost in contrastive learning, maintains spatial structural consistency between semantic attributes and visual features, and further optimizes the feature generator through a reconstruction loss. Furthermore, we design an integrated GZSL module that couples the embedding module with the constrained generative module, establishing collaborative learning between the two to obtain a more discriminative and generalizable model. Finally, extensive experiments on four benchmark datasets demonstrate that CESAM achieves state-of-the-art performance.
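To make the contrastive embedding idea concrete, the following is a minimal sketch of visual-level and semantic-level contrastive supervision in a shared embedding space. The network layout, feature dimensions, temperature, and the exact pairing scheme are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveEmbedding(nn.Module):
    """Illustrative joint embedding: projects visual features and class
    attribute vectors into a shared space (dimensions are assumptions)."""
    def __init__(self, vis_dim=2048, attr_dim=85, embed_dim=256):
        super().__init__()
        self.visual_proj = nn.Sequential(
            nn.Linear(vis_dim, 512), nn.ReLU(), nn.Linear(512, embed_dim))
        self.semantic_proj = nn.Sequential(
            nn.Linear(attr_dim, 512), nn.ReLU(), nn.Linear(512, embed_dim))

    def forward(self, vis_feats, class_attrs):
        z_v = F.normalize(self.visual_proj(vis_feats), dim=-1)
        z_s = F.normalize(self.semantic_proj(class_attrs), dim=-1)
        return z_v, z_s


def visual_contrastive_loss(z_v, labels, temperature=0.1):
    """Visual-level supervision: embeddings of samples sharing a class label
    are pulled together and others pushed apart (supervised contrastive form)."""
    sim = z_v @ z_v.t() / temperature                       # pairwise similarities
    mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
    mask.fill_diagonal_(0)                                  # exclude self-pairs from positives
    logits_mask = torch.ones_like(mask).fill_diagonal_(0)   # exclude self-pairs from denominator
    exp_sim = torch.exp(sim) * logits_mask
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)
    pos_count = mask.sum(dim=1).clamp(min=1)
    return -((mask * log_prob).sum(dim=1) / pos_count).mean()


def semantic_contrastive_loss(z_v, z_s_all, labels, temperature=0.1):
    """Semantic-level supervision: each visual embedding is matched to the
    embedding of its own class attributes against all other class prototypes."""
    logits = z_v @ z_s_all.t() / temperature                # (batch, num_classes)
    return F.cross_entropy(logits, labels)


# Hypothetical usage: 32 samples, 2048-d visual features, 85-d attributes, 50 classes.
model = ContrastiveEmbedding()
vis = torch.randn(32, 2048)
attrs = torch.randn(50, 85)
labels = torch.randint(0, 50, (32,))
z_v, z_s = model(vis, attrs)
loss = visual_contrastive_loss(z_v, labels) + semantic_contrastive_loss(z_v, z_s, labels)
```

In the paper's framework this embedding loss would be optimized jointly with the generative module and the reconstruction-based structural alignment constraint described above; the sketch isolates only the contrastive supervision component.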