Dual Projective Zero-Shot Learning Using Text Descriptions

Cited by: 7

Authors:

Rao, Yunbo [1]

Yang, Ziqiang [1]

Zeng, Shaoning [2]

Wang, Qifeng [3]

Pu, Jiansu [4]

Affiliations:

[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, 4,Sect 2,North Jianshe Rd, Chengdu 610054, Sichuan, Peoples R China

[2] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Huzhou, Chengdu 313000, Sichuan, Peoples R China

[3] Google Berkeley, Berkeley, CA 94720 USA

[4] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, 4,Sect 2,North Jianshe Rd, Chengdu 610054, Sichuan, Peoples R China

Source:

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS | 2023 / Vol. 19 / No. 01

Funding:

National Natural Science Foundation of China;

Keywords:

Zero-shot learning; generalized zero-shot learning; autoencoder; inductive zero-shot learning;

DOI:

10.1145/3514247

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Zero-shot learning (ZSL) aims to recognize image instances of unseen classes based solely on the semantic descriptions of those classes. Within this field, Generalized Zero-Shot Learning (GZSL) is a challenging problem in which images of both seen and unseen classes are mixed in the testing phase. Existing methods formulate GZSL as a semantic-visual correspondence problem and apply generative models such as Generative Adversarial Networks and Variational Autoencoders to solve it. However, these methods suffer from a bias problem: images of unseen classes are often misclassified into seen classes. In this work, a novel model named the Dual Projective model for Zero-Shot Learning (DPZSL) is proposed using text descriptions. To alleviate the bias problem, we leverage two autoencoders to project the visual and semantic features into a latent space and evaluate the embeddings with a visual-semantic correspondence loss function. An additional novel classifier is also introduced to ensure the discriminability of the embedded features. Our method focuses on the more challenging inductive ZSL setting, in which only the labeled data from seen classes are used in the training phase. The experimental results, obtained on two popular datasets, Caltech-UCSD Birds-200-2011 (CUB) and North America Birds (NAB), show that the proposed DPZSL model significantly outperforms existing methods in both the inductive ZSL and GZSL settings. In the GZSL setting in particular, our model yields an improvement of up to 15.2% over the state-of-the-art CANZSL on the CUB and NAB datasets with two splittings.
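
The dual-projection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear encoders and decoders, the mean-squared-error reconstruction and alignment terms, and every name used (`init_autoencoder`, `dual_projection_loss`, `align_weight`) are assumptions made for illustration only. The abstract does not specify the architecture, the form of the correspondence loss, or the additional classifier, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_autoencoder(in_dim, latent_dim):
    """Create weights for a linear autoencoder (an illustrative assumption)."""
    return {
        "enc": rng.normal(scale=0.1, size=(in_dim, latent_dim)),
        "dec": rng.normal(scale=0.1, size=(latent_dim, in_dim)),
    }


def encode(ae, x):
    """Project input features into the shared latent space."""
    return x @ ae["enc"]


def decode(ae, z):
    """Map latent codes back to the input feature space."""
    return z @ ae["dec"]


def dual_projection_loss(visual_ae, semantic_ae, v, s, align_weight=1.0):
    """Sum of two reconstruction terms and one latent alignment term.

    v: visual features, shape (batch, visual_dim)
    s: semantic (text) features of the matching classes, shape (batch, semantic_dim)
    The squared-error alignment term stands in for the paper's
    visual-semantic correspondence loss, whose exact form is not given
    in the abstract.
    """
    z_v = encode(visual_ae, v)
    z_s = encode(semantic_ae, s)
    recon_v = np.mean((decode(visual_ae, z_v) - v) ** 2)
    recon_s = np.mean((decode(semantic_ae, z_s) - s) ** 2)
    align = np.mean((z_v - z_s) ** 2)
    total = recon_v + recon_s + align_weight * align
    return total, (recon_v, recon_s, align)


# Toy usage with random features; all dimensions are arbitrary.
visual_ae = init_autoencoder(in_dim=16, latent_dim=4)
semantic_ae = init_autoencoder(in_dim=10, latent_dim=4)
v_batch = rng.normal(size=(8, 16))
s_batch = rng.normal(size=(8, 10))
loss, parts = dual_projection_loss(visual_ae, semantic_ae, v_batch, s_batch)
print(float(loss))
```

In an inductive setting such as the one the abstract describes, a loss of this kind would be minimized using only seen-class pairs; at test time, an image's latent code could then be compared with the latent codes of the class descriptions.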

Pages: 17

50 items in total

[1] Integrating topology beyond descriptions for zero-shot learning
Chen, Ziyi
Gao, Yutong
Lang, Congyan
Wei, Lili
Li, Yidong
Liu, Hongzhe
Liu, Fayao
PATTERN RECOGNITION, 2023, 143
[2] Dual insurance for generalized zero-shot learning
Liang, Jiahao
Fang, Xiaozhao
Kang, Peipei
Han, Na
Li, Chuang
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2025, 16 (03) : 2111 - 2125
[3] Dual Prototype Contrastive Network for Generalized Zero-Shot Learning
Jiang, Huajie
Li, Zhengxian
Hu, Yongli
Yin, Baocai
Yang, Jian
van den Hengel, Anton
Yang, Ming-Hsuan
Qi, Yuankai
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (02) : 1111 - 1122
[4] A Unified Approach for Conventional Zero-Shot, Generalized Zero-Shot, and Few-Shot Learning
Rahman, Shafin
Khan, Salman
Porikli, Fatih
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (11) : 5652 - 5667
[5] Dual Part Discovery Network for Zero-Shot Learning
Ge, Jiannan
Xie, Hongtao
Min, Shaobo
Li, Pandeng
Zhang, Yongdong
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022 : 3244 - 3252
[6] Dual triplet network for image zero-shot learning
Ji, Zhong
Wang, Hai
Pang, Yanwei
Shao, Ling
NEUROCOMPUTING, 2020, 373 : 90 - 97
[7] A Dual Discriminator Method for Generalized Zero-Shot Learning
Wei, Tianshu
Huang, Jinjie
CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 79 (01) : 1599 - 1612
[8] Dual-verification network for zero-shot learning
Zhang, Haofeng
Long, Yang
Yang, Wankou
Shao, Ling
INFORMATION SCIENCES, 2019, 470 : 43 - 57
[9] Approaching Zero-shot Learning from a Text-to-Image GAN perspective
Talkani, Ayman
Bhojan, Anand
2022 IEEE 24TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2022
[10] Generalized Zero-Shot Learning using Identifiable Variational Autoencoders
Gull, Muqaddas
Arif, Omar
EXPERT SYSTEMS WITH APPLICATIONS, 2022, 191