Label Consistent Matrix Factorization Hashing for Large-Scale Cross-Modal Similarity Search

Cited by: 162
Authors
Wang, Di [1]
Gao, Xinbo [2]
Wang, Xiumei [2]
He, Lihuo [2]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, 2 South Taibai Rd, Xian 710071, Shaanxi, Peoples R China
[2] Xidian Univ, Video & Image Proc Syst VIPs Lab, Sch Elect Engn, 2 South Taibai Rd, Xian 710071, Shaanxi, Peoples R China
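Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Hashing; multimodal; supervised; similarity search; cross-modal; quantization
DOI
10.1109/TPAMI.2018.2861000
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract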
Multimodal hashing has attracted much interest for cross-modal similarity search on large-scale multimedia data sets because of its efficiency and effectiveness. Recently, supervised multimodal hashing, which tries to preserve the semantic information obtained from the labels of training data, has received considerable attention for its higher search accuracy compared with unsupervised multimodal hashing. Although these algorithms are promising, they are mainly designed to preserve pairwise similarities. When semantic labels of training data are given, the algorithms often transform the labels into pairwise similarities, which gives rise to the following problems: (1) constructing a pairwise similarity matrix requires enormous storage space and a large amount of computation, making these methods unscalable to large-scale data sets; (2) transforming labels into pairwise similarities loses the category information of the training data. As a result, the hash codes learned by these methods do not preserve the discriminative information reflected by the labels, and retrieval accuracy suffers. To address these challenges, this paper introduces a simple yet effective supervised multimodal hashing method, called label consistent matrix factorization hashing (LCMFH), which uses semantic labels directly to guide the hash learning procedure. Considering that relevant data from different modalities have semantic correlations, LCMFH transforms heterogeneous data into latent semantic spaces in which multimodal data from the same category share the same representation. Hash codes quantized from the obtained representations are therefore consistent with the semantic labels of the original data and have more discriminative power for cross-modal similarity search tasks. Thorough experiments on standard databases show that the proposed algorithm outperforms several state-of-the-art methods.
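The two ideas in the abstract (the storage cost of a pairwise similarity matrix versus a label matrix, and codes that agree across modalities when the labels agree) can be illustrated with a small toy sketch. This is not the LCMFH optimization: the function names, the random `basis` matrix, and the sign-based quantization are assumptions made for illustration only. In the actual method the latent representation is learned by matrix factorization rather than drawn at random.

```python
import random


def pairwise_matrix_entries(n):
    """Entries in an n x n pairwise similarity matrix."""
    return n * n


def label_matrix_entries(n, c):
    """Entries in an n x c label matrix (n samples, c categories)."""
    return n * c


def sign_codes(labels, basis):
    """Map label vectors to +1/-1 codes through a shared category basis.

    labels: list of length-c label vectors (one-hot or multi-hot).
    basis:  c x k matrix; row j is a latent vector for category j.
            Random here (an assumption); a real method would learn it.
    """
    k = len(basis[0])
    codes = []
    for y in labels:
        # Latent representation depends only on the label vector.
        v = [sum(y[j] * basis[j][b] for j in range(len(y))) for b in range(k)]
        codes.append([1 if x >= 0 else -1 for x in v])
    return codes


def hamming(a, b):
    """Number of positions where two codes differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
num_classes, code_bits = 3, 8
basis = [[rng.gauss(0, 1) for _ in range(code_bits)] for _ in range(num_classes)]

# Hypothetical labels for two images and two texts.
image_labels = [[1, 0, 0], [0, 1, 0]]
text_labels = [[1, 0, 0], [0, 0, 1]]

image_codes = sign_codes(image_labels, basis)
text_codes = sign_codes(text_labels, basis)

# An image and a text from the same category receive identical codes.
same_category_distance = hamming(image_codes[0], text_codes[0])
```

For one million training samples and ten categories, `pairwise_matrix_entries` gives 10^12 entries while `label_matrix_entries` gives 10^7, which is the scalability gap the abstract refers to. Because the codes here depend only on the label vector, samples that share a category get the same code regardless of modality.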
Pages: 2466-2479
Page count: 14