Cross-Modal Deep Variational Hashing

Cited by: 75
Authors
Liong, Venice Erin [1, 3]
Lu, Jiwen [2]
Tan, Yap-Peng [3]
Zhou, Jie [2]
Affiliations
[1] Nanyang Technol Univ, Interdisciplinary Grad Sch, Rapid Rich Object Search ROSE Lab, Singapore, Singapore
[2] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
[3] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore, Singapore
Source
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2017
Funding
National Natural Science Foundation of China
DOI
10.1109/ICCV.2017.439
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a cross-modal deep variational hashing (CMDVH) method for cross-modality multimedia retrieval. Unlike existing cross-modal hashing methods, which learn a single pair of projections to map each example to a binary vector, we design a pair of deep neural networks to learn non-linear transformations from image-text input pairs, so that unified binary codes can be obtained. We then design the modality-specific neural networks in a probabilistic manner, modeling a latent variable that is as close as possible to the inferred binary codes and is approximated by a posterior distribution regularized by a known prior. Experimental results on three benchmark datasets show the efficacy of the proposed approach.
Pages: 4097-4105
Page count: 9