SCH-GAN: Semi-Supervised Cross-Modal Hashing by Generative Adversarial Network

Cited by: 94
Authors
Zhang, Jian [1 ]
Peng, Yuxin [1 ]
Yuan, Mingkuan [1 ]
Affiliation
[1] Peking Univ, Inst Comp Sci & Technol, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semantics; Data models; Correlation; Generative adversarial networks; Training data; Predictive models; Cross-modal hashing; generative adversarial network (GAN); semi-supervised; codes
DOI
10.1109/TCYB.2018.2868826
Chinese Library Classification
TP [Automation and Computer Technology]
Discipline code
0812
Abstract
Cross-modal hashing maps heterogeneous multimedia data into a common Hamming space to enable fast and flexible cross-modal retrieval. Supervised cross-modal hashing methods have made considerable progress by incorporating semantic side information. However, they rely heavily on large-scale labeled cross-modal training data, which are hard to obtain because multiple modalities are involved. They also ignore the rich information contained in the large amount of unlabeled data across modalities, which can help model the correlations between them. To address these problems, this paper proposes a novel semi-supervised cross-modal hashing approach based on a generative adversarial network (SCH-GAN). The main contributions are as follows: 1) we propose a novel generative adversarial network for cross-modal hashing, in which the generative model tries to select margin examples of one modality from unlabeled data given a query of another modality (e.g., a text query used to retrieve images, and vice versa), while the discriminative model tries to distinguish between the selected examples and the true positive examples of the query. These two models play a minimax game in which the generative model promotes the hashing performance of the discriminative model; and 2) we propose a reinforcement learning-based algorithm to drive the training of the proposed SCH-GAN. The generative model takes the correlation score predicted by the discriminative model as a reward and tries to select examples close to the margin so as to improve the discriminative model. Extensive experiments on three widely used datasets verify the effectiveness of the proposed approach against nine state-of-the-art methods.
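The training dynamic described above can be sketched in a few lines: a generator policy samples unlabeled candidates for a query, receives the discriminator's correlation score as a reward, and is updated with a REINFORCE-style gradient. This is a minimal, hypothetical illustration, not the paper's actual networks: random sign projections stand in for the learned modality-specific hashing networks, and the discriminator is reduced to a Hamming-similarity score.

```python
import numpy as np

rng = np.random.default_rng(0)
d_feat, n_bits = 8, 16

def hamming(a, b):
    # Hamming distance between two binary hash codes
    return int(np.sum(a != b))

def hash_code(x, W):
    # Toy "hash function": random projection + sign, standing in
    # for a learned modality-specific hashing network.
    return (x @ W > 0).astype(np.int8)

W_img = rng.standard_normal((d_feat, n_bits))  # image branch (stand-in)
W_txt = rng.standard_normal((d_feat, n_bits))  # text branch (stand-in)

def disc_score(q_code, c_code):
    # Discriminator stand-in: correlation of query and candidate codes,
    # expressed as 1 minus the normalized Hamming distance (in [0, 1]).
    return 1.0 - hamming(q_code, c_code) / n_bits

def generator_step(logits, q_code, cand_codes, lr=0.5):
    # Generator: softmax policy over unlabeled candidates for the query.
    # Sample one candidate, take the discriminator's score as the reward,
    # and apply a REINFORCE-style update (hypothetical simplification).
    p = np.exp(logits - logits.max())
    p /= p.sum()
    i = rng.choice(len(p), p=p)
    reward = disc_score(q_code, cand_codes[i])
    grad = -p
    grad[i] += 1.0                      # d log p_i / d logits
    return logits + lr * reward * grad, i, reward
```

Under this simplification, repeating `generator_step` shifts probability mass toward candidates the discriminator scores highly, i.e., toward the hard examples near the margin that the abstract says the generator should select.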
Pages: 489-502 (14 pages)