Region Graph Embedding Network for Zero-Shot Learning

Cited by: 135

Authors:

Xie, Guo-Sen [1]

Liu, Li [1]

Zhu, Fan [1]

Zhao, Fang [1]

Zhang, Zheng [2,3]

Yao, Yazhou [5]

Qin, Jie [1]

Shao, Ling [1,4]

Affiliations:

[1] Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates

[2] Harbin Institute of Technology, Shenzhen, People's Republic of China

[3] Peng Cheng Laboratory, Shenzhen, People's Republic of China

[4] Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates

[5] Nanjing University of Science and Technology, Nanjing, People's Republic of China

Source:

COMPUTER VISION - ECCV 2020, PT IV | 2020 / Volume 12349

Funding:

National Natural Science Foundation of China;

Keywords:

Zero-shot learning; Parts relation reasoning; Balance loss;

DOI:

10.1007/978-3-030-58548-8_33

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405;

Abstract:

Most existing Zero-Shot Learning (ZSL) approaches learn direct embeddings from global features or image parts (regions) to the semantic space; these, however, fail to capture the appearance relationships between different local regions within a single image. In this paper, to model the relations among local image regions, we incorporate region-based relation reasoning into ZSL. Our method, termed Region Graph Embedding Network (RGEN), is trained end-to-end from raw image data. Specifically, RGEN consists of two branches: the Constrained Part Attention (CPA) branch and the Parts Relation Reasoning (PRR) branch. The CPA branch is built upon attention and produces the image regions. To exploit the progressive interactions among these regions, we represent them as a region graph, on which parts relation reasoning is performed with graph convolutions, thus leading to our PRR branch. To train our model, we introduce both a transfer loss and a balance loss to contrast class similarities and pursue the maximum response consistency among seen and unseen outputs, respectively. Extensive experiments on four datasets validate the effectiveness of the proposed method under both ZSL and generalized ZSL settings.

Citation

Pages: 562-580

Page count: 19
