Generalized Semantic Preserving Hashing for n-Label Cross-Modal Retrieval

Cited by: 92
Authors
Mandal, Devraj [1]
Chaudhury, Kunal N. [1]
Biswas, Soma [1]
Affiliations
[1] Indian Inst Sci, Bangalore 560012, Karnataka, India
Source
30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017) | 2017
DOI
10.1109/CVPR.2017.282
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the availability of large amounts of multimedia data, cross-modal matching is gaining increasing importance. Hashing-based techniques provide an attractive solution to this problem when the data size is large. Different scenarios of cross-modal matching are possible: for example, data from the different modalities can be associated with a single label or multiple labels, and in addition may or may not have one-to-one correspondence. Most of the existing approaches have been developed for the case where there is one-to-one correspondence between the data of the two modalities. In this paper, we propose a simple yet effective generalized hashing framework which can work for all the different scenarios, while preserving the semantic distance between the data points. The approach first learns the optimum hash codes for the two modalities simultaneously, so as to preserve the semantic similarity between the data points, and then learns the hash functions to map from the features to the hash codes. Extensive experiments on a single-label dataset (Wiki) and multi-label datasets (NUS-WIDE, Pascal, and LabelMe) under all the different scenarios, along with comparisons against the state of the art, show the effectiveness of the proposed approach.
Pages: 2633-2641
Page count: 9