Deep semantic similarity adversarial hashing for cross-modal retrieval

Cited by: 13
Authors
Qiang, Haopeng [1 ]
Wan, Yuan [1 ]
Xiang, Lun [2 ]
Meng, Xiaojing [1 ]
Affiliations
[1] Wuhan Univ Technol, Sch Sci, Math Dept, Wuhan 430070, Peoples R China
[2] Wuhan Univ Technol, Sch Sci, Stat Dept, Wuhan 430070, Peoples R China
Keywords
Hashing learning; Semantic similarity; Adversarial learning; Cross-modal retrieval
DOI
10.1016/j.neucom.2020.03.032
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Cross-modal retrieval has attracted considerable attention due to the rapid development of the Internet and social media, and cross-modal hashing has been widely and successfully applied in this domain. However, most existing hashing methods pay little attention to the semantic similarity levels between instances, simply classifying the semantic relationship as either similar or dissimilar. Moreover, preserving the semantic similarity of the original data in the extracted features is rarely explored by existing methods. Owing to the heterogeneity between modalities, the similarity of features from different modalities cannot be computed directly. Therefore, in this paper, we propose deep semantic similarity adversarial hashing (DSSAH) for cross-modal retrieval. We first compute semantic similarity using both label and feature information, which provides a more accurate similarity value between instances. Then an adversarial modality discriminator is introduced to establish a common feature space in which the similarity of features from each modality can be computed. Finally, two loss functions, an inter-modal loss and an intra-modal loss, are designed to generate high-quality hash codes. Experiments on three common cross-modal retrieval datasets show that DSSAH outperforms state-of-the-art cross-modal hashing methods. (C) 2020 Elsevier B.V. All rights reserved.
Pages: 24-33
Number of pages: 10
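
The abstract describes three ingredients: a graded semantic-similarity matrix built from both labels and features (rather than a binary similar/dissimilar matrix), an adversarial modality discriminator that pushes image and text representations into a common space, and inter-modal plus intra-modal losses that fit hash codes to that similarity. The PyTorch sketch below is a minimal reading of the abstract, not the authors' DSSAH implementation: the layer sizes, the blending weight alpha in semantic_similarity, the feature dimensions, and the equal weighting of the three loss terms are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HashEncoder(nn.Module):
    """Maps modality-specific features to K-bit codes (tanh relaxation of sign)."""
    def __init__(self, in_dim, code_len):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, code_len), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class ModalityDiscriminator(nn.Module):
    """Guesses whether a common-space code came from the image or text branch."""
    def __init__(self, code_len):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(code_len, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, h):
        return self.net(h)

def semantic_similarity(labels, feats, alpha=0.5):
    """Graded similarity blending label overlap with feature cosine similarity.
    `alpha` is a hypothetical trade-off weight, not a value from the paper."""
    l = F.normalize(labels.float(), dim=1)
    f = F.normalize(feats, dim=1)
    return alpha * (l @ l.t()) + (1.0 - alpha) * (f @ f.t())

# --- one illustrative alternating training step on random stand-in data ---
B, D_img, D_txt, K = 32, 4096, 1386, 64           # batch, feature dims, code length
img_feat, txt_feat = torch.randn(B, D_img), torch.randn(B, D_txt)
labels = torch.randint(0, 2, (B, 24))             # multi-label annotations

img_enc, txt_enc = HashEncoder(D_img, K), HashEncoder(D_txt, K)
disc = ModalityDiscriminator(K)
opt_g = torch.optim.Adam(list(img_enc.parameters()) + list(txt_enc.parameters()), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)

img_code, txt_code = img_enc(img_feat), txt_enc(txt_feat)
d_labels = torch.cat([torch.ones(B, 1), torch.zeros(B, 1)])

# 1) discriminator update: learn to tell modalities apart (codes detached)
d_logits = torch.cat([disc(img_code.detach()), disc(txt_code.detach())])
d_loss = F.binary_cross_entropy_with_logits(d_logits, d_labels)
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# 2) encoder update: preserve graded similarity and fool the discriminator
S = semantic_similarity(labels, torch.cat([img_feat, txt_feat], dim=1))
inter = ((img_code @ txt_code.t()) / K - S).pow(2).mean()                  # inter-modal
intra = ((img_code @ img_code.t()) / K - S).pow(2).mean() \
      + ((txt_code @ txt_code.t()) / K - S).pow(2).mean()                  # intra-modal
adv = F.binary_cross_entropy_with_logits(
    torch.cat([disc(img_code), disc(txt_code)]), 1.0 - d_labels)           # flipped labels
g_loss = inter + intra + adv
opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# at retrieval time, binarize the relaxed codes: hash = torch.sign(img_enc(img_feat))

In a full pipeline this step would run inside a training loop over real image and text features, and the tanh outputs would be binarized with sign() only when building the retrieval database.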