Cross-Modal Deep Variational Hashing

Cited by: 75

Authors:

Liong, Venice Erin [1,3]

Lu, Jiwen [2]

Tan, Yap-Peng [3]

Zhou, Jie [2]

Affiliations:

[1] Nanyang Technol Univ, Interdisciplinary Grad Sch, Rapid Rich Object Search ROSE Lab, Singapore, Singapore

[2] Tsinghua Univ, Dept Automat, Beijing, Peoples R China

[3] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore, Singapore

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2017

Funding:

National Natural Science Foundation of China;

Keywords:

DOI:

10.1109/ICCV.2017.439

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a cross-modal deep variational hashing (CMDVH) method for cross-modality multimedia retrieval. Unlike existing cross-modal hashing methods, which learn a single pair of projections to map each example to a binary vector, we design a couple of deep neural networks to learn non-linear transformations from image-text input pairs, so that unified binary codes can be obtained. We then design the modality-specific neural networks in a probabilistic manner, where we model a latent variable to be as close as possible to the inferred binary codes, which is approximated by a posterior distribution regularized by a known prior. Experimental results on three benchmark datasets show the efficacy of the proposed approach.


Pages: 4097-4105

Page count: 9

