Semantic-consistent cross-modal hashing for large-scale image retrieval

Times Cited: 11
Authors
Gu, Xuesong [1 ]
Dong, Guohua [2 ,3 ]
Zhang, Xiang [2 ,3 ]
Lan, Long [2 ,3 ]
Luo, Zhigang [2 ,3 ]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Sci & Technol Parallel & Distributed Proc, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Coll Comp, Inst Quantum Informat, Changsha 410073, Hunan, Peoples R China
[3] Natl Univ Def Technol, Coll Comp, State Key Lab High Performance Comp, Changsha 410073, Hunan, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Cross-modal hashing; Cross-modal retrieval; Supervised learning;
DOI
10.1016/j.neucom.2020.11.007
CLC Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With an emphasis on saving storage and computation costs, hash learning has achieved considerable success in cross-modal image retrieval. Most pioneering efforts in this direction either consider similarity across modalities or leverage the discriminative information of class labels to learn a common latent representation. However, the learnt representation only captures coherent semantics across modalities and may not align with the class-wise semantic structure. To address this issue, we propose semantic-consistent cross-modal hashing (SCCH), which takes the class semantic structure into consideration. It not only ensures that the common representation is consistent across modalities by directly learning the shared binary codes of samples via a rotation transformation, but also aligns the class-wise representation with the learnt binary codes. In this way, SCCH jointly preserves the class semantic structure and avoids the large quantization errors caused by approximating real values with binary codes. Moreover, we optimize SCCH efficiently via an iterative algorithm. Experiments on three public datasets demonstrate the superiority of SCCH over several representative state-of-the-art counterparts across standard performance metrics. (c) 2020 Published by Elsevier B.V.
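The abstract describes learning shared binary codes directly via a rotation transformation so as to avoid large quantization errors. SCCH's exact objective is not reproduced in this record; the sketch below only illustrates the underlying rotation idea in the style of Iterative Quantization (alternating between binarization and an orthogonal Procrustes update), with the single-matrix setup and all names being illustrative assumptions rather than the paper's method.

```python
import numpy as np

def itq_rotation(V, n_iter=50, seed=0):
    """ITQ-style alternating optimization: learn an orthogonal rotation R
    and binary codes B that minimize the quantization error ||B - V R||_F.
    V is an (n_samples, n_bits) real-valued (e.g. zero-centered) feature matrix."""
    rng = np.random.default_rng(seed)
    d = V.shape[1]
    # random orthogonal initialization of the rotation
    R, _ = np.linalg.qr(rng.standard_normal((d, d)))
    for _ in range(n_iter):
        B = np.where(V @ R >= 0, 1.0, -1.0)   # fix R: binarize the rotated data
        U, _, Wt = np.linalg.svd(V.T @ B)     # fix B: orthogonal Procrustes
        R = U @ Wt                            # optimal rotation for current codes
    return np.where(V @ R >= 0, 1.0, -1.0), R
```

Each iteration can only decrease (or keep) the quantization error, since both sub-steps are exact minimizers of the shared objective with the other variable held fixed.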
Pages: 181-198
Page count: 18