Hybrid-Attention based Decoupled Metric Learning for Zero-Shot Image Retrieval

Cited by: 42
Authors
Chen, Binghui [1, 2]
Deng, Weihong [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100193, Peoples R China
[2] AI Labs, Didi Chuxing, Beijing 100193, Peoples R China
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
Funding
National Natural Science Foundation of China;
DOI
10.1109/CVPR.2019.00286
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the zero-shot image retrieval (ZSIR) task, embedding learning has become more attractive; however, many methods follow the traditional metric learning idea and overlook the problems behind zero-shot settings. In this paper, we first emphasize the importance of learning a visually discriminative metric and of preventing the partial/selective learning behavior of the learner in ZSIR, and then propose the Decoupled Metric Learning (DeML) framework to achieve these individually. Instead of coarsely optimizing a unified metric, we decouple it into multiple attention-specific parts so as to recurrently induce the discrimination and explicitly enhance the generalization. These are mainly achieved by our object-attention module based on random walk graph propagation and the channel-attention module based on the adversary constraint, respectively. We demonstrate the necessity of addressing these vital problems in ZSIR on popular benchmarks, outperforming the state-of-the-art methods by a significant margin.
Pages: 2745-2754
Number of pages: 10