Language-Augmented Pixel Embedding for Generalized Zero-Shot Learning

Citations: 11

Authors:

Wang, Ziyang [1,2]

Gou, Yunhao [1,2]

Li, Jingjing [2]

Zhu, Lei [3]

Shen, Heng Tao [3]

Affiliations:

[1] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Huzhou, Huzhou 313002, Peoples R China

[2] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 611731, Peoples R China

[3] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2023 / Vol. 33 / No. 03

Funding:

National Natural Science Foundation of China;

Keywords:

Semantics; Visualization; Task analysis; Feature extraction; Image recognition; Annotations; Knowledge transfer; Zero-shot learning; transfer learning; attention mechanism;

DOI:

10.1109/TCSVT.2022.3208256

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Zero-shot Learning (ZSL) aims to recognize novel classes using knowledge learned from seen classes. The canonical approach to ZSL leverages a visual-to-semantic embedding to map the global features of an image sample to its semantic representation. These global features usually overlook the fine-grained information that is vital for knowledge transfer between seen and unseen classes, which makes them sub-optimal for the ZSL task, especially the more realistic Generalized Zero-shot Learning (GZSL) task, where the global features of similar classes can hardly be separated. To remedy this problem, we propose Language-Augmented Pixel Embedding (LAPE), which directly bridges the visual and semantic spaces in a pixel-based manner. To this end, we map the local features of each pixel to different attributes and then extract each semantic attribute from the corresponding pixel. However, the lack of pixel-level annotation leads to inefficient pixel-based knowledge transfer. To mitigate this problem, we adopt the text information of each attribute to augment the local features of the image pixels that are related to the semantic attributes. Experiments on four ZSL benchmarks demonstrate that LAPE outperforms current state-of-the-art methods. Comprehensive ablation studies and analyses are provided to dissect which factors lead to this success.
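
The pixel-to-attribute idea described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: it is a minimal NumPy example that assumes local pixel features of shape (P, D), one text embedding per attribute of shape (A, D), and plain dot-product attention so that each attribute pools from the pixels most related to it. The function names `attribute_scores` and `classify`, the shapes, and the use of text embeddings as attention queries are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attribute_scores(pixel_feats, attr_text_emb):
    """Predict one strength per attribute from local (pixel) features.

    pixel_feats:   (P, D) local features, one row per pixel (assumed shape).
    attr_text_emb: (A, D) text embedding per attribute (assumed input).
    Returns an array of shape (A,).
    """
    sim = attr_text_emb @ pixel_feats.T      # (A, P) attribute-pixel similarity
    attn = softmax(sim, axis=1)              # each attribute attends over pixels
    attended = attn @ pixel_feats            # (A, D) attribute-specific features
    # Score each attribute by projecting its attended feature onto its text embedding.
    return np.sum(attended * attr_text_emb, axis=1)


def classify(pixel_feats, attr_text_emb, class_attrs):
    """Pick the class whose attribute vector best matches the predicted attributes.

    class_attrs: (C, A) per-class attribute vectors (seen and unseen classes alike).
    """
    scores = attribute_scores(pixel_feats, attr_text_emb)
    logits = class_attrs @ scores            # (C,) compatibility with each class
    return int(np.argmax(logits))
```

In this sketch, unseen classes are handled only through their rows in `class_attrs`, which is the usual attribute-based ZSL setup; how LAPE actually trains the embedding and augments pixel features with text is specified in the paper itself, not here.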


Pages: 1019-1030

Page count: 12
