Deep Joint-Semantics Reconstructing Hashing for Large-Scale Unsupervised Cross-Modal Retrieval

Cited by: 209
Authors
Su, Shupeng [1]
Zhong, Zhisheng [1]
Zhang, Chao [1]
Affiliations
[1] Peking Univ, Sch EECS, Key Lab Machine Percept MOE, Beijing, Peoples R China
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
DOI
10.1109/ICCV.2019.00312
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cross-modal hashing encodes multimedia data into a common binary hash space in which the correlations among samples from different modalities can be effectively measured. Deep cross-modal hashing further improves retrieval performance, as deep neural networks can generate more semantically relevant features and hash codes. In this paper, we study unsupervised deep cross-modal hash coding and propose Deep Joint-Semantics Reconstructing Hashing (DJSRH), which has two main advantages. First, to learn binary codes that preserve the neighborhood structure of the original data, DJSRH constructs a novel joint-semantics affinity matrix that integrates the original neighborhood information from different modalities and is therefore capable of capturing the latent intrinsic semantic affinity of the input multi-modal instances. Second, DJSRH then trains the networks to generate binary codes that maximally reconstruct the above joint-semantics relations via the proposed reconstructing framework, which is better suited to batch-wise training because it reconstructs the specific similarity values, unlike the common Laplacian constraint, which merely preserves the similarity order. Extensive experiments demonstrate significant improvements by DJSRH in various cross-modal retrieval tasks.
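The two ideas in the abstract can be illustrated with a small NumPy sketch. It is not the paper's exact formulation, which the abstract does not spell out. It assumes that the joint affinity is a weighted average of per-modality cosine similarities and that the reconstruction objective is a mean squared error between that affinity and the cosine similarity of the hash codes. The function names and the parameters `beta` and `mu` are hypothetical, chosen only for illustration.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_n @ b_n.T


def joint_affinity(image_feats, text_feats, beta=0.5):
    """Assumed fusion rule: weighted average of the two modality affinities."""
    s_img = cosine_similarity_matrix(image_feats, image_feats)
    s_txt = cosine_similarity_matrix(text_feats, text_feats)
    return beta * s_img + (1.0 - beta) * s_txt


def reconstruction_loss(codes_a, codes_b, affinity, mu=1.0):
    """Mean squared gap between code similarity and the scaled affinity.

    Matching the similarity values themselves (not only their ranking)
    is the property the abstract contrasts with a Laplacian constraint.
    """
    code_sim = cosine_similarity_matrix(codes_a, codes_b)
    return float(np.mean((mu * affinity - code_sim) ** 2))


# Toy batch: 8 paired instances with random stand-in features and codes.
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(8, 16))
text_feats = rng.normal(size=(8, 12))
affinity = joint_affinity(image_feats, text_feats)

image_codes = np.sign(rng.normal(size=(8, 32)))  # entries in {-1, +1}
text_codes = np.sign(rng.normal(size=(8, 32)))
loss = reconstruction_loss(image_codes, text_codes, affinity)
```

In a real training loop the codes would come from the image and text networks, and the loss would be minimized per batch; the random arrays here only stand in for those outputs.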
Pages: 3027-3035
Number of pages: 9
Related papers
50 in total
  • [1] Unsupervised Deep Cross-Modal Hashing by Knowledge Distillation for Large-scale Cross-modal Retrieval
    Li, Mingyong
    Wang, Hongya
    PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 183 - 191
  • [2] Joint-modal Distribution-based Similarity Hashing for Large-scale Unsupervised Deep Cross-modal Retrieval
    Liu, Song
    Qian, Shengsheng
    Guan, Yang
    Zhan, Jiawei
    Ying, Long
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 1379 - 1388
  • [3] CLIP-based fusion-modal reconstructing hashing for large-scale unsupervised cross-modal retrieval
    Li, Mingyong
    Li, Yewen
    Ge, Mingyuan
    Ma, Longfei
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2023, 12 (01)
  • [4] CLIP-based fusion-modal reconstructing hashing for large-scale unsupervised cross-modal retrieval
    Li Mingyong
    Li Yewen
    Ge Mingyuan
    Ma Longfei
    International Journal of Multimedia Information Retrieval, 2023, 12
  • [5] Semantics-Reconstructing Hashing for Cross-Modal Retrieval
    Zhang, Peng-Fei
    Huang, Zi
    Zhang, Zheng
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2020, PT II, 2020, 12085 : 315 - 327
  • [6] Unsupervised Deep Hashing via Binary Latent Factor Models for Large-scale Cross-modal Retrieval
    Wu, Gengshen
    Lin, Zijia
    Han, Jungong
    Liu, Li
    Ding, Guiguang
    Zhang, Baochang
    Shen, Jialie
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 2854 - 2860
  • [7] Unsupervised Joint-Semantics Autoencoder Hashing for Multimedia Retrieval
    Chen, Yunfei
    Long, Jun
    Li, Yinan
    Wu, Yanrui
    Yang, Zhan
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT V, 2024, 14451 : 318 - 330
  • [8] Joint and individual matrix factorization hashing for large-scale cross-modal retrieval
    Wang, Di
    Wang, Quan
    He, Lihuo
    Gao, Xinbo
    Tian, Yumin
    PATTERN RECOGNITION, 2020, 107
  • [9] Unsupervised multi-graph cross-modal hashing for large-scale multimedia retrieval
    Liang Xie
    Lei Zhu
    Guoqi Chen
    Multimedia Tools and Applications, 2016, 75 : 9185 - 9204
  • [10] Unsupervised multi-graph cross-modal hashing for large-scale multimedia retrieval
    Xie, Liang
    Zhu, Lei
    Chen, Guoqi
    MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (15) : 9185 - 9204