Language-Augmented Pixel Embedding for Generalized Zero-Shot Learning

Cited by: 11
Authors
Wang, Ziyang [1, 2]
Gou, Yunhao [1, 2]
Li, Jingjing [2]
Zhu, Lei [3]
Shen, Heng Tao [3]
Affiliations
[1] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Huzhou, Huzhou 313002, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 611731, Peoples R China
[3] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Visualization; Task analysis; Feature extraction; Image recognition; Annotations; Knowledge transfer; Zero-shot learning; transfer learning; attention mechanism;
DOI
10.1109/TCSVT.2022.3208256
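Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract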
Zero-shot Learning (ZSL) aims to recognize novel classes using knowledge learned from seen classes. The canonical approach to ZSL uses a visual-to-semantic embedding to map the global features of an image to its semantic representation. These global features usually overlook the fine-grained information that is vital for knowledge transfer between seen and unseen classes. This makes them sub-optimal for the ZSL task, and especially for the more realistic Generalized Zero-shot Learning (GZSL) task, where the global features of similar classes can hardly be separated. To remedy this problem, we propose Language-Augmented Pixel Embedding (LAPE), which directly bridges the visual and semantic spaces in a pixel-based manner. To this end, we map the local features of each pixel to different attributes and then extract each semantic attribute from the corresponding pixel. However, the lack of pixel-level annotation leads to inefficient pixel-based knowledge transfer. To mitigate this, we use the text information of each attribute to augment the local features of the image pixels that are related to the semantic attributes. Experiments on four ZSL benchmarks demonstrate that LAPE outperforms current state-of-the-art methods. Comprehensive ablation studies and analyses are provided to dissect the factors that lead to this success.
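The abstract gives no implementation details, so the following is only a rough illustrative sketch of the general idea it describes: attribute text embeddings attend over per-pixel local features, and the resulting attribute scores are matched against class attribute vectors. It is not the authors' LAPE model. The function names, the projection matrix `W`, and the dot-product attention used here are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def pixel_attribute_scores(pixel_feats, attr_text_emb, W):
    """Score each attribute from the pixels it attends to.

    pixel_feats:   (P, D) local features, one row per pixel (assumed input)
    attr_text_emb: (A, T) text embedding of each attribute (assumed input)
    W:             (T, D) hypothetical projection from text to visual space
    """
    queries = attr_text_emb @ W           # (A, D) attribute queries
    logits = queries @ pixel_feats.T      # (A, P) attribute-pixel similarity
    attn = softmax(logits, axis=1)        # each attribute attends over pixels
    attended = attn @ pixel_feats         # (A, D) attribute-specific feature
    scores = np.sum(queries * attended, axis=1)  # (A,) one score per attribute
    return scores, attn


def predict_class(scores, class_attr):
    """Pick the class whose attribute vector best matches the scores.

    class_attr: (C, A) attribute vector per class, seen and unseen alike.
    """
    return int(np.argmax(class_attr @ scores))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pixels = rng.normal(size=(49, 16))    # e.g. a 7x7 feature map, 16 channels
    attr_text = rng.normal(size=(5, 8))   # 5 attributes, 8-dim text embeddings
    W = rng.normal(size=(8, 16))
    class_attr = rng.normal(size=(4, 5))  # 4 classes described by 5 attributes
    scores, attn = pixel_attribute_scores(pixels, attr_text, W)
    print(scores.shape, attn.shape, predict_class(scores, class_attr))
```

In this sketch, classes without training images are handled only through their rows in `class_attr`, which is what allows seen and unseen classes to share one scoring rule.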
Pages: 1019 - 1030
Page count: 12
Related papers
50 in total
  • [21] A Unified Approach for Conventional Zero-Shot, Generalized Zero-Shot, and Few-Shot Learning
    Rahman, Shafin
    Khan, Salman
    Porikli, Fatih
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (11) : 5652 - 5667
  • [22] Semantic Consistent Embedding for Domain Adaptive Zero-Shot Learning
    Zhang, Jianyang
    Yang, Guowu
    Hu, Ping
    Lin, Guosheng
    Lv, Fengmao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 4024 - 4035
  • [23] Generative Mixup Networks for Zero-Shot Learning
    Xu, Bingrong
    Zeng, Zhigang
    Lian, Cheng
    Ding, Zhengming
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022,
  • [24] Learning Multipart Attention Neural Network for Zero-Shot Classification
    Meng, Min
    Wei, Jie
    Wu, Jigang
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (02) : 414 - 423
  • [25] Augmented Multimodality Fusion for Generalized Zero-Shot Sketch-Based Visual Retrieval
    Jing, Taotao
    Xia, Haifeng
    Hamm, Jihun
    Ding, Zhengming
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 3657 - 3668
  • [26] Region interaction and attribute embedding for zero-shot learning
    Hu, Zhengwei
    Zhao, Haitao
    Peng, Jingchao
    Gu, Xiaojing
    INFORMATION SCIENCES, 2022, 609 : 984 - 995
  • [27] Fine-Grained Feature Generation for Generalized Zero-Shot Video Classification
    Hong, Mingyao
    Zhang, Xinfeng
    Li, Guorong
    Huang, Qingming
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 1599 - 1612
  • [28] ENCYCLOPEDIA ENHANCED SEMANTIC EMBEDDING FOR ZERO-SHOT LEARNING
    Jia, Zhen
    Zhang, Junge
    Huang, Kaiqi
    Tan, Tieniu
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 1287 - 1291
  • [29] GNDAN: Graph Navigated Dual Attention Network for Zero-Shot Learning
    Chen, Shiming
    Hong, Ziming
    Xie, Guosen
    Peng, Qinmu
    You, Xinge
    Ding, Weiping
    Shao, Ling
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 4516 - 4529
  • [30] Denoised and Dynamic Alignment Enhancement for Zero-Shot Learning
    Ge, Jiannan
    Liu, Zhihang
    Li, Pandeng
    Xie, Lingxi
    Zhang, Yongdong
    Tian, Qi
    Xie, Hongtao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 1501 - 1515