Learning Multipart Attention Neural Network for Zero-Shot Classification

Cited by: 9

Authors:

Meng, Min [1]

Wei, Jie [1]

Wu, Jigang [1]

Affiliations:

[1] Guangdong Univ Technol, Dept Comp Sci, Guangzhou 510006, Peoples R China

Source:

IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS | 2022, Vol. 14, No. 2

Funding:

National Natural Science Foundation of China;

Keywords:

Semantics; Visualization; Neural networks; Training; Image recognition; Prototypes; Feature extraction; Attention mechanism; part annotations; visual recognition; zero-shot learning (ZSL);

DOI:

10.1109/TCDS.2020.3044313

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Zero-shot learning (ZSL) models typically learn a cross-modal mapping between the visual feature space and the semantic embedding space. Despite the promising performance of existing methods, they usually take visual features from the whole image as the main inputs, while paying little attention to the image regions that are relevant to humans' visual response to the whole image. In this article, we propose a neural network-based ZSL model which incorporates an attention mechanism to discover the discriminative parts of each image. The proposed model allows us to automatically generate attention maps for visual parts, which provides a flexible way of encoding the salient visual aspects that distinguish the categories. Moreover, we introduce a simple yet effective objective function to exploit the pairwise label information between images and classes, resulting in substantial performance improvement. When multiple semantic spaces are available, a multiple-attention scheme is provided to fuse the different semantic spaces, which helps to achieve further improvement in performance. On the widely used CUB-200-2011 data set for fine-grained image classification, we demonstrate the advantages of using the attention mechanism and semantic parts in our model for ZSL. Comprehensive experimental results show that our proposed approach outperforms the state-of-the-art methods.

References

Pages: 414-423

Page count: 10

51 entries in total

[1] Multi-Cue Zero-Shot Learning with Strong Supervision [J].

Akata, Zeynep ;

Malinowski, Mateusz ;

Fritz, Mario ;

Schiele, Bernt .

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :59-68

[2]

Akata Z, 2015, PROC CVPR IEEE, P2927, DOI 10.1109/CVPR.2015.7298911

[3] Label-Embedding for Attribute-Based Classification [J].

Akata, Zeynep ;

Perronnin, Florent ;

Harchaoui, Zaid ;

Schmid, Cordelia .

2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, :819-826

[4] Preserving Semantic Relations for Zero-Shot Learning [J].

Annadani, Yashas ;

Biswas, Soma .

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :7603-7612

[5] Predicting Deep Zero-Shot Convolutional Neural Networks using Textual Descriptions [J].

Ba, Jimmy Lei ;

Swersky, Kevin ;

Fidler, Sanja ;

Salakhutdinov, Ruslan .

2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :4247-4255

[6] Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classification [J].

Bucher, Maxime ;

Herbin, Stephane ;

Jurie, Frederic .

COMPUTER VISION - ECCV 2016, PT V, 2016, 9909 :730-746

[7] Synthesized Classifiers for Zero-Shot Learning [J].

Changpinyo, Soravit ;

Chao, Wei-Lun ;

Gong, Boqing ;

Sha, Fei .

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :5327-5336

[8] Attention to Scale: Scale-aware Semantic Image Segmentation [J].

Chen, Liang-Chieh ;

Yang, Yi ;

Wang, Jiang ;

Xu, Wei ;

Yuille, Alan L. .

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3640-3649

[9] Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks [J].

Chen, Long ;

Zhang, Hanwang ;

Xiao, Jun ;

Liu, Wei ;

Chang, Shih-Fu .

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :1043-1052

[10] Link the head to the "beak": Zero Shot Learning from Noisy Text Description at Part Precision [J].

Elhoseiny, Mohamed ;

Zhu, Yizhe ;

Zhang, Han ;

Elgammal, Ahmed .

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :6288-6297
