Dynamic visual-guided selection for zero-shot learning

Cited by: 2
Authors
Zhou, Yuan [1 ]
Xiang, Lei [1 ]
Liu, Fan [1 ]
Duan, Haoran [2 ]
Long, Yang [2 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Artificial Intelligence, Nanjing 210044, Jiangsu, Peoples R China
[2] Univ Durham, Dept Comp Sci, Durham, England
Keywords
Visual-guided selection; Class prototype refinement; Task-relevant regions; Zero-shot learning;
DOI
10.1007/s11227-023-05625-1
CLC number
TP3 [Computing and computer technology]
Discipline code
0812
Abstract
Zero-shot learning (ZSL) methods currently employed to identify seen or unseen classes rely on semantic attribute prototypes or class information. However, hand-annotated attributes describe only the category as a whole rather than each individual image belonging to that category. Furthermore, attribute information is inconsistent across different images of the same category due to varying viewpoints. We therefore propose a dynamic visual-guided selection (DVGS) method that dynamically focuses on different regions and refines the class prototype for each image. Instead of directly aligning an image's global feature with its semantic class vector, or its local features with all attribute vectors, the proposed method learns a vision-guided soft mask that refines the class prototype per image. It then discovers the most task-relevant regions for fine-grained recognition using the refined class prototype. Extensive experiments on three benchmarks verify the effectiveness of DVGS, which achieves new state-of-the-art results. DVGS obtains the best results on the fine-grained datasets in both the conventional zero-shot learning (CZSL) and generalized zero-shot learning (GZSL) settings. In particular, on the SUN dataset, DVGS outperforms the second-best approach by 10.2% in the CZSL setting. Similarly, it outperforms the second-best method by an average of 4% on CUB in both the CZSL and GZSL settings. Despite securing only the second-best result on the AWA2 dataset, DVGS remains closely competitive, trailing the best performance by a mere 3.4% in CZSL and 1.2% in GZSL.
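The core mechanism in the abstract, a vision-guided soft mask that gates each attribute dimension of the class prototype on a per-image basis, can be sketched roughly as follows. This is a minimal illustration, not the paper's exact architecture: the mask projection `W`, the feature dimensions, and the element-wise gating are assumptions for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def refine_prototype(visual_feat, class_attr, W):
    """Refine a class attribute prototype with a vision-guided soft mask.

    A mask in (0, 1)^K is predicted from the image's visual feature and
    gates each of the K attribute dimensions of the class prototype,
    yielding a per-image refined prototype (hypothetical formulation).
    """
    mask = sigmoid(W @ visual_feat)   # (K,) soft mask from the image feature
    return mask * class_attr          # element-wise gated class prototype

# Toy example with random features (d-dim visual feature, K attributes).
rng = np.random.default_rng(0)
d, K = 8, 5
v = rng.normal(size=d)        # global visual feature of one image
a_c = rng.uniform(size=K)     # class-level attribute prototype
W = rng.normal(size=(K, d))   # learnable mask projection (assumed)
refined = refine_prototype(v, a_c, W)
print(refined.shape)  # (5,)
```

Recognition would then score the image's visual embedding against each class's refined prototype instead of the shared class-level one, which is what lets attribute emphasis vary with viewpoint across images of the same category.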
Pages: 4401-4419
Page count: 19
Related papers
48 references in total
  • [1] Akata Z, Perronnin F, Harchaoui Z, Schmid C. Label-Embedding for Image Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(7): 1425-1438
  • [2] Alamri F. 2021, arXiv preprint
  • [3] Brown T B. 2020, Advances in Neural Information Processing Systems, Vol 33
  • [4] Chao W-L, Changpinyo S, Gong B, Sha F. An Empirical Study and Analysis of Generalized Zero-Shot Learning for Object Recognition in the Wild. Computer Vision - ECCV 2016, Pt II, 2016, 9906: 52-68
  • [5] Chen L, Zhang H, Xiao J, Liu W, Chang S-F. Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 1043-1052
  • [6] Chen S M. 2021, Advances in Neural Information Processing Systems, Vol 34
  • [7] Chen S M. 2022, AAAI Conference on Artificial Intelligence, p. 330
  • [8] Chen S, Hong Z, Xie G-S, Yang W, Peng Q, Wang K, Zhao J, You X. MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 7602-7611
  • [9] Chen S, Wang W, Xia B, Peng Q, You X, Zheng F, Shao L. FREE: Feature Refinement for Generalized Zero-Shot Learning. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021: 122-131
  • [10] Chou Y-Y. 2021, International Conference on Learning Representations