Dynamic visual-guided selection for zero-shot learning

Cited by: 2
Authors
Zhou, Yuan [1 ]
Xiang, Lei [1 ]
Liu, Fan [1 ]
Duan, Haoran [2 ]
Long, Yang [2 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Artificial Intelligence, Nanjing 210044, Jiangsu, Peoples R China
[2] Univ Durham, Dept Comp Sci, Durham, England
Keywords
Visual-guided selection; Class prototype refinement; Task-relevant regions; Zero-shot learning;
DOI
10.1007/s11227-023-05625-1
CLC number
TP3 [Computing and computer technology]
Discipline code
0812
Abstract
Zero-shot learning (ZSL) methods currently employed to identify seen or unseen classes rely on semantic attribute prototypes or class information. However, hand-annotated attributes describe only the category as a whole rather than each individual image belonging to that category. Furthermore, attribute information is inconsistent across different images of the same category due to varying viewpoints. We therefore propose a dynamic visual-guided selection (DVGS) method that dynamically focuses on different regions and refines the class prototype for each image. Instead of directly aligning an image's global feature with its semantic class vector, or its local features with all attribute vectors, the proposed method learns a vision-guided soft mask that refines the class prototype per image. It then discovers the most task-relevant regions for fine-grained recognition using the refined class prototype. Extensive experiments on three benchmarks verify the effectiveness of DVGS, which achieves new state-of-the-art results. DVGS obtains the best results on the fine-grained datasets in both the conventional zero-shot learning (CZSL) and generalized zero-shot learning (GZSL) settings. In particular, on the SUN dataset, DVGS outperforms the second-best approach by 10.2% in the CZSL setting. Similarly, it outperforms the second-best method by an average of 4% on CUB in both the CZSL and GZSL settings. Despite securing only the second-best result on the AWA2 dataset, DVGS remains closely competitive, trailing the best performance by a mere 3.4% in CZSL and 1.2% in GZSL.
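The core mechanism in the abstract, a vision-guided soft mask that gates each attribute dimension of the class prototype on a per-image basis, can be sketched roughly as follows. This is a minimal illustration, not the paper's exact architecture: the mask projection `W`, the feature dimensions, and the element-wise gating are assumptions for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def refine_prototype(visual_feat, class_attr, W):
    """Refine a class attribute prototype with a vision-guided soft mask.

    A mask in (0, 1)^K is predicted from the image's visual feature and
    gates each of the K attribute dimensions of the class prototype,
    yielding a per-image refined prototype (hypothetical formulation).
    """
    mask = sigmoid(W @ visual_feat)   # (K,) soft mask from the image feature
    return mask * class_attr          # element-wise gated class prototype

# Toy example with random features (d-dim visual feature, K attributes).
rng = np.random.default_rng(0)
d, K = 8, 5
v = rng.normal(size=d)        # global visual feature of one image
a_c = rng.uniform(size=K)     # class-level attribute prototype
W = rng.normal(size=(K, d))   # learnable mask projection (assumed)
refined = refine_prototype(v, a_c, W)
print(refined.shape)  # (5,)
```

Recognition would then score the image's visual embedding against each class's refined prototype instead of the shared class-level one, which is what lets attribute emphasis vary with viewpoint across images of the same category.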
Pages: 4401-4419
Page count: 19
Related papers
48 references in total
  • [1] Akata Z, Perronnin F, Harchaoui Z, Schmid C. Label-Embedding for Image Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(7): 1425-1438
  • [2] Alamri F. 2021, arXiv preprint
  • [3] Brown T B. 2020, Advances in Neural Information Processing Systems, Vol 33
  • [4] Chao W-L, Changpinyo S, Gong B, Sha F. An Empirical Study and Analysis of Generalized Zero-Shot Learning for Object Recognition in the Wild. Computer Vision - ECCV 2016, Pt II, 2016, 9906: 52-68
  • [5] Chen L, Zhang H, Xiao J, Liu W, Chang S-F. Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 1043-1052
  • [6] Chen S M. 2021, Advances in Neural Information Processing Systems, Vol 34
  • [7] Chen S M. 2022, AAAI Conference on Artificial Intelligence, p. 330
  • [8] Chen S, Hong Z, Xie G-S, Yang W, Peng Q, Wang K, Zhao J, You X. MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 7602-7611
  • [9] Chen S, Wang W, Xia B, Peng Q, You X, Zheng F, Shao L. FREE: Feature Refinement for Generalized Zero-Shot Learning. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021: 122-131
  • [10] Chou Y-Y. 2021, International Conference on Learning Representations