Label Consistent Flexible Matrix Factorization Hashing for Efficient Cross-modal Retrieval

Cited by: 36
Authors
Zhang, Donglin [1 ]
Wu, Xiao-Jun [1 ]
Yu, Jun [1 ]
Affiliations
[1] Jiangnan Univ, Sch Artificial Intelligence & Comp Sci, 1800 Lihu Ave, Wuxi 214122, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Hashing; cross-modal retrieval; flexible matrix factorization;
DOI
10.1145/3446774
CLC classification
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
Hashing methods have sparked a great revolution in large-scale cross-media search due to their effectiveness and efficiency. Most existing approaches learn a unified hash representation in a common Hamming space to represent all multimodal data. However, unified hash codes may not characterize the cross-modal data discriminatively, because the modalities can vary greatly in their dimensionalities, physical properties, and statistical distributions. In addition, most existing supervised cross-modal algorithms preserve the similarity relationship by constructing an n × n pairwise similarity matrix, which requires a large amount of computation and loses the category information. To mitigate these issues, a novel cross-media hashing approach is proposed in this article, dubbed label consistent flexible matrix factorization hashing (LFMH). Specifically, LFMH jointly learns modality-specific latent subspaces with similar semantics by flexible matrix factorization. In addition, LFMH guides the hash learning by utilizing the semantic labels directly instead of the large n × n pairwise similarity matrix. LFMH transforms the heterogeneous data into modality-specific latent semantic representations; the hash codes are then obtained by quantizing these representations, and the learned codes are consistent with the supervised labels of the multimodal data. As a result, similar samples in each modality receive similar binary codes, and those codes characterize the samples flexibly. Accordingly, the derived hash codes have more discriminative power for single-modal and cross-modal retrieval tasks. Extensive experiments on eight different databases demonstrate that our model outperforms several competitive approaches.
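The pipeline described in the abstract (modality-specific latent representations learned by matrix factorization, label guidance replacing the n × n similarity matrix, sign quantization to binary codes) can be sketched as follows. This is a minimal illustration under an assumed least-squares objective ||X − VW||² + λ||L − VP||² per modality, with made-up update rules and parameter names; it is not the authors' actual LFMH optimization.

```python
import numpy as np

def cross_modal_hash_sketch(X_img, X_txt, labels, n_bits=16, n_iter=20, lam=1.0, seed=0):
    """Hedged sketch of label-guided matrix-factorization hashing.

    X_img: (n, d1) and X_txt: (n, d2) feature matrices; labels: (n, c) one-hot.
    Learns a modality-specific latent factor V per modality by alternating
    least squares on ||X - V W||^2 + lam * ||labels - V P||^2 (an illustrative
    objective, not the paper's exact formulation), then quantizes V by sign.
    """
    rng = np.random.default_rng(seed)
    n = X_img.shape[0]
    codes = {}
    for name, X in (("img", X_img), ("txt", X_txt)):
        V = rng.standard_normal((n, n_bits))  # modality-specific latent factors
        for _ in range(n_iter):
            # Basis W and label projection P: least squares given V
            W = np.linalg.lstsq(V, X, rcond=None)[0]       # (n_bits, d)
            P = np.linalg.lstsq(V, labels, rcond=None)[0]  # (n_bits, c)
            # V: closed form combining the feature and label terms
            A = W @ W.T + lam * (P @ P.T) + 1e-6 * np.eye(n_bits)
            B = X @ W.T + lam * (labels @ P.T)
            V = B @ np.linalg.inv(A)
        # Quantize the latent representation to binary codes in {-1, +1}
        codes[name] = np.sign(V)
        codes[name][codes[name] == 0] = 1
    return codes
```

At retrieval time, one would rank database codes of the opposite modality by Hamming distance to the query's code, e.g. `0.5 * (n_bits - query_code @ db_codes.T)`.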
Pages: 18
References
45 entries in total
  • [1] Baeza-Yates R. Modern Information Retrieval. 1999, Vol. 463.
  • [2] Cao Y., Long M., Wang J., Yang Q., Yu P. S. Deep Visual-Semantic Hashing for Cross-Modal Retrieval. KDD'16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016: 1445-1454.
  • [3] Chua T.-S., et al. Proceedings of the ACM International Conference on Image and Video Retrieval, 2009: 1. DOI 10.1145/1646396.1646452.
  • [4] Dalal N., Triggs B. Histograms of Oriented Gradients for Human Detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, 2005: 886-893.
  • [5] Ding G., Guo Y., Zhou J. Collective Matrix Factorization Hashing for Multimodal Data. 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014: 2083-2090.
  • [6] Everingham M., Van Gool L., Williams C. K. I., Winn J., Zisserman A. The Pascal Visual Object Classes (VOC) Challenge. International Journal of Computer Vision, 2010, 88(2): 303-338.
  • [7] FEIWANG PC. Information Retrieval, 2012, 15: 179.
  • [8] Gong Y., Ke Q., Isard M., Lazebnik S. A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics. International Journal of Computer Vision, 2014, 106(2): 210-233.
  • [9] Hu M., Yang Y., Shen F., Xie N., Hong R., Shen H. T. Collective Reconstructive Embeddings for Cross-Modal Hashing. IEEE Transactions on Image Processing, 2019, 28(6): 2770-2784.
  • [10] Huiskes M. J. Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval, 2008: 39. DOI 10.1145/1460096.1460104.