Semantic Fusion and Contrastive Generation for Generalized Zero-Shot Learning

Cited by: 0
Authors
Yang, Guan [1 ,2 ,3 ,4 ]
Sun, Weihao [1 ,2 ,3 ]
Liu, Xiaoming [2 ,3 ,4 ]
Liu, Yang [5 ]
Wang, Chen [1 ,2 ,3 ]
Affiliations
[1] Zhongyuan Univ Technol, Sch Artificial Intelligence, Zhengzhou 450007, Henan, Peoples R China
[2] Zhongyuan Univ Technol, Sch Comp Sci, Zhengzhou 450007, Henan, Peoples R China
[3] Zhongyuan Univ Technol, Zhengzhou Key Lab Text Proc & Image Understanding, Zhengzhou 450007, Henan, Peoples R China
[4] Zhongyuan Univ Technol, Henan Key Lab Publ Opin Intelligent Anal, Zhengzhou 450007, Henan, Peoples R China
[5] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Generalized Zero-Shot Learning; Semantic Fusion; Image Generation; Image Classification; Zero-Shot Retrieval;
DOI
10.1007/s13735-025-00372-w
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generalized Zero-Shot Learning (GZSL) aims to leverage a classifier trained on seen classes to categorize instances from both seen and unseen classes. Several approaches have been introduced to synthesize visual features that simulate those of unseen classes for training classifiers. However, existing methods only emphasize the distributional relationships between synthesized and real features, while neglecting the inter-class relationships among the synthesized features. Consequently, synthesized visual features exhibit markedly loose intra-class distributions and numerous outliers. Furthermore, a generator trained solely on seen classes tends to overfit those classes. In this paper, a Semantic Fusion and Contrastive Generation (SFCG) framework is proposed for GZSL. Specifically, a visual-semantic contrastive generation method and a visual-feature similarity loss are explored to address the challenges of loose intra-class distributions and outliers in synthesized visual features. Moreover, semantic attributes are fused to create novel and diverse semantic instances for training a balanced generator. The SFCG model is evaluated on four widely used ZSL benchmark datasets: CUB, FLO, AWA2, and SUN. It achieves harmonic mean accuracies of 68.4% on CUB, 71.3% on FLO, 73.4% on AWA2, and 45.2% on SUN, demonstrating the efficacy of the proposed method.
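The semantic-fusion idea described above (combining seen-class attributes into novel semantic instances) can be sketched as a convex combination of class attribute vectors. The abstract does not specify the fusion scheme, so the pairwise mixup-style interpolation and Beta-distributed coefficient below are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def fuse_attributes(attrs, rng, alpha=0.2):
    """Create a novel semantic instance by convexly mixing two
    seen-class attribute vectors.

    Illustrative sketch only: the pairwise mixing and the Beta(alpha,
    alpha) coefficient are assumptions, not the paper's exact method.

    attrs: (num_seen_classes, attr_dim) class-attribute matrix.
    rng:   numpy random Generator.
    """
    # Pick two distinct seen classes to fuse.
    i, j = rng.choice(len(attrs), size=2, replace=False)
    # Mixing coefficient in (0, 1); small alpha favors values near 0 or 1.
    lam = rng.beta(alpha, alpha)
    return lam * attrs[i] + (1 - lam) * attrs[j]

rng = np.random.default_rng(0)
attrs = rng.random((10, 312))  # e.g. CUB uses 312-dimensional attributes
novel = fuse_attributes(attrs, rng)
```

Because the result is a convex combination, a fused instance stays inside the attribute range of the seen classes while still being a semantic vector no seen class possesses, which is what lets it diversify the generator's training signal.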
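The visual-semantic contrastive generation component can likewise be sketched with a standard InfoNCE-style objective: each synthesized visual feature is pulled toward its own class's semantic anchor and pushed away from the anchors of other classes, which tightens intra-class distributions. The anchor construction and temperature below are assumptions for illustration; the paper's loss may differ:

```python
import numpy as np

def contrastive_loss(feats, sem, labels, tau=0.1):
    """InfoNCE-style visual-semantic contrastive loss (sketch).

    feats:  (N, d) synthesized visual features.
    sem:    (C, d) per-class semantic anchors projected into the same space
            (how that projection is learned is outside this sketch).
    labels: (N,)   class index of each synthesized feature.
    tau:    temperature controlling the sharpness of the softmax.
    """
    # Cosine similarity via L2 normalization.
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sem = sem / np.linalg.norm(sem, axis=1, keepdims=True)
    logits = feats @ sem.T / tau                       # (N, C)
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-likelihood of each feature's own class anchor.
    return -log_prob[np.arange(len(labels)), labels].mean()
```

Minimizing this loss drives features of the same class toward a shared anchor, directly countering the loose intra-class distributions and outliers the abstract attributes to purely distribution-matching generators.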
Pages: 11