Dual Prototype Contrastive Network for Generalized Zero-Shot Learning

Times Cited: 1
Authors
Jiang, Huajie [1 ]
Li, Zhengxian [1 ]
Hu, Yongli [1 ]
Yin, Baocai [1 ]
Yang, Jian [2 ]
van den Hengel, Anton [3 ]
Yang, Ming-Hsuan [4 ]
Qi, Yuankai
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing Inst Artificial Intelligence, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[2] Macquarie Univ, Sch Comp, Sydney, NSW 2109, Australia
[3] Univ Adelaide, Sch Comp Sci, Adelaide, SA 5000, Australia
[4] Univ Calif Merced, Dept Elect Engn & Comp Sci, Merced, CA 95343 USA
Funding
National Natural Science Foundation of China;
Keywords
Visualization; Semantics; Prototypes; Contrastive learning; Zero shot learning; Generative adversarial networks; Object recognition; Feature extraction; Training; Face recognition; Generalized zero-shot learning; prototype learning; contrastive learning;
DOI
10.1109/TCSVT.2024.3474910
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
Generalized zero-shot learning (GZSL) requires models to recognize both the classes they were trained on and new classes they have never seen. Feature-generation approaches are popular because they effectively mitigate overfitting to the training classes. However, existing generative approaches usually adopt simple discriminators for distribution or classification supervision, which limits their ability to generate visual features that are both discriminative and transferable to novel categories. To overcome this limitation and improve the quality of the generated features, we propose a dual prototype contrastive augmented discriminator for the generative adversarial network. Specifically, we design a Dual Prototype Contrastive Network (DPCN) that leverages complementary information between the visual and semantic spaces through multi-task prototype contrastive learning. Contrastive learning over the visual prototypes makes the generated features more discriminative across classes, while contrastive learning over the semantic prototypes improves their transferability. Furthermore, we introduce margins into the contrastive learning process to ensure both intra-class compactness and inter-class separation. To demonstrate the effectiveness of the proposed approach, we conduct experiments on three widely used zero-shot learning benchmark datasets, where DPCN achieves state-of-the-art performance for GZSL.
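The record contains no code, so the following is only a minimal PyTorch sketch of what a margin-based prototype contrastive loss of the kind described in the abstract could look like; the function name prototype_contrastive_loss and the temperature and margin parameters are illustrative assumptions, not the authors' actual formulation. In DPCN, a loss of this form would presumably be applied twice, once with visual prototypes and once with semantic prototypes.

# Minimal sketch (not the authors' code) of a margin-based prototype
# contrastive loss, assuming one prototype per class and cosine similarity.
import torch
import torch.nn.functional as F

def prototype_contrastive_loss(features, labels, prototypes,
                               temperature=0.1, margin=0.2):
    """Pull each feature toward its class prototype and push it away from the
    other prototypes; the margin is subtracted from the positive similarity,
    so the true class must win by at least that margin (intra-class
    compactness and inter-class separation).

    features   : (N, D) generated or real features
    labels     : (N,)   class indices
    prototypes : (C, D) one prototype per class
    """
    feats = F.normalize(features, dim=-1)
    protos = F.normalize(prototypes, dim=-1)

    # Cosine similarity of every feature to every class prototype: (N, C).
    sims = feats @ protos.t()

    # Apply the margin only to each feature's own-class similarity.
    one_hot = F.one_hot(labels, num_classes=protos.size(0)).float()
    logits = (sims - margin * one_hot) / temperature

    # Cross-entropy over prototypes = InfoNCE with prototypes as positives.
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    torch.manual_seed(0)
    feats = torch.randn(8, 64)            # e.g. generator outputs
    labels = torch.randint(0, 5, (8,))    # 5 hypothetical classes
    protos = torch.randn(5, 64, requires_grad=True)
    print(prototype_contrastive_loss(feats, labels, protos).item())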
Pages: 1111-1122
Number of Pages: 12