Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval

Citations: 393
Authors
Li, Chao [1]
Deng, Cheng [1]
Li, Ning [1]
Liu, Wei [2]
Gao, Xinbo [1]
Tao, Dacheng [3]
Affiliations
[1] Xidian Univ, Sch Elect Engn, Xian 710071, Shaanxi, Peoples R China
[2] Tencent AI Lab, Shenzhen, Peoples R China
[3] Univ Sydney, UBTECH Sydney AI Ctr, SIT, FEIT, Sydney, NSW, Australia
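Source
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR.2018.00446
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract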
Thanks to the success of deep learning, cross-modal retrieval has made significant progress recently. However, there still remains a crucial bottleneck: how to bridge the modality gap to further enhance the retrieval accuracy. In this paper, we propose a self-supervised adversarial hashing (SSAH) approach, which lies among the early attempts to incorporate adversarial learning into cross-modal hashing in a self-supervised fashion. The primary contribution of this work is that two adversarial networks are leveraged to maximize the semantic correlation and consistency of the representations between different modalities. In addition, we harness a self-supervised semantic network to discover high-level semantic information in the form of multi-label annotations. Such information guides the feature learning process and preserves the modality relationships in both the common semantic space and the Hamming space. Extensive experiments carried out on three benchmark datasets validate that the proposed SSAH surpasses the state-of-the-art methods.
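To make the Hamming-space retrieval step mentioned in the abstract concrete, the following is a minimal sketch of how binary hash codes from two modalities could be compared. It is not the SSAH method itself: the adversarial and self-supervised networks are omitted, and the feature values, the sign-based binarization, and the function names (to_hash_code, hamming_distance, retrieve) are all illustrative assumptions.

```python
# Illustrative sketch only: the feature vectors below are made-up numbers
# standing in for the outputs of modality-specific networks (assumption).

def to_hash_code(features):
    """Binarize real-valued features into a {-1, +1} code using the sign."""
    return [1 if value >= 0 else -1 for value in features]


def hamming_distance(code_a, code_b):
    """Count the bit positions at which two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


def retrieve(query_code, database_codes, top_k=3):
    """Return indices of the top_k database codes closest to the query."""
    order = sorted(
        range(len(database_codes)),
        key=lambda index: hamming_distance(query_code, database_codes[index]),
    )
    return order[:top_k]


# Hypothetical 4-bit example: a text query searched against three images.
image_features = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.8, 0.6],
    [0.8, -0.1, -0.3, -0.9],
]
text_query_features = [0.7, -0.4, 0.2, -0.6]

image_codes = [to_hash_code(features) for features in image_features]
query_code = to_hash_code(text_query_features)
ranking = retrieve(query_code, image_codes)
```

In this toy setup the ranking lists image indices from the smallest to the largest Hamming distance to the text query. Comparing short binary codes in this way is what makes hashing-based retrieval cheap at scale.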
Pages: 4242-4251
Page count: 10