Region Semantically Aligned Network for Zero-Shot Learning

Cited by: 7

Authors:

Wang, Ziyang [1]

Gou, Yunhao [1]

Li, Jingjing [1]

Zhang, Yu [2]

Yang, Yang [1]

Affiliations:

[1] University of Electronic Science and Technology of China, Chengdu, People's Republic of China

[2] Southern University of Science and Technology, Shenzhen, People's Republic of China

Source:

Proceedings of the 30th ACM International Conference on Information & Knowledge Management, CIKM 2021 | 2021

Funding:

National Natural Science Foundation of China;

Keywords:

zero-shot learning; transfer learning; multimodal learning;

DOI:

10.1145/3459637.3482471

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Zero-shot learning (ZSL) aims to recognize unseen classes based on the knowledge of seen classes. Previous methods focused on learning direct embeddings from global features to the semantic space in the hope of transferring knowledge from seen classes to unseen classes. However, an unseen class shares local visual features with a set of seen classes, and leveraging global visual features makes the knowledge transfer ineffective. To tackle this problem, we propose a Region Semantically Aligned Network (RSAN), which maps local features of unseen classes to their semantic attributes. Instead of using global features, which are obtained by an average pooling layer after an image encoder, we directly utilize the output of the image encoder, which maintains local information of the image. Concretely, we obtain each attribute from a specific region of the output and exploit these attributes for recognition. As a result, the knowledge of seen classes can be successfully transferred to unseen classes in a region-based manner. In addition, we regularize the image encoder through attribute regression with semantic knowledge to extract robust and attribute-related visual features. Experiments on several standard ZSL datasets reveal the benefit of the proposed RSAN method, which outperforms state-of-the-art methods.
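
The contrast the abstract draws between a globally pooled feature and region-level attributes can be illustrated with a small toy sketch. This is not the authors' implementation: the function names, the use of a maximum over regions to pick "a specific region" per attribute, the dot-product class scoring, and the class names and numbers are all assumptions made for illustration only.

```python
from typing import Dict, List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def global_average_pool(regions: List[Vector]) -> Vector:
    """Average all region features into one global feature.

    This is the pooling step the abstract contrasts with; it blurs
    where in the image each feature came from.
    """
    n = len(regions)
    return [sum(channel) / n for channel in zip(*regions)]


def region_attribute_scores(regions: List[Vector],
                            attribute_filters: List[Vector]) -> Vector:
    """Score each attribute by its strongest response over all regions.

    ASSUMPTION: taking the max over regions is one simple way to read
    "obtain each attribute from a specific region"; the paper may differ.
    """
    return [max(dot(r, w) for r in regions) for w in attribute_filters]


def predict_class(attr_scores: Vector,
                  class_attributes: Dict[str, Vector]) -> str:
    """Pick the class whose attribute vector best matches the scores.

    ASSUMPTION: dot-product compatibility; unseen classes only need an
    attribute vector, not training images.
    """
    return max(class_attributes,
               key=lambda c: dot(attr_scores, class_attributes[c]))


# Hypothetical 2x2 feature map flattened to 4 regions with 2 channels each.
regions = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
# Hypothetical per-attribute channel weights (e.g. "striped", "has tail").
attribute_filters = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical class-level attribute vectors.
class_attributes = {"zebra": [1.0, 1.0], "horse": [0.0, 1.0]}

pooled = global_average_pool(regions)                      # [0.25, 0.25]
scores = region_attribute_scores(regions, attribute_filters)  # [1.0, 1.0]
label = predict_class(scores, class_attributes)            # "zebra"
```

In this toy input each attribute fires strongly in exactly one region, so the region-level scores stay at 1.0 while the globally pooled feature dilutes both to 0.25, which is the kind of loss of local information the abstract attributes to average pooling.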


Pages: 2080-2090

Page count: 11

