Hierarchical Semantic Structure Preserving Hashing for Cross-Modal Retrieval

Cited by: 19
Authors
Wang, Di [1, 2]
Zhang, Caiping [3]
Wang, Quan [3]
Tian, Yumin [3]
He, Lihuo [4]
Zhao, Lin [2]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian 710071, Peoples R China
[2] Nanjing Univ Sci & Technol, Jiangsu Key Lab Image & Video Understanding Social, Nanjing 210093, Jiangsu, Peoples R China
[3] Xidian Univ, Sch Comp Sci & Technol, Key Laboratory Smart Human Comp Interact & Wearable, Xian 710071, Peoples R China
[4] Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semantics; Codes; Binary codes; Representation learning; Correlation; Hash functions; Feature extraction; Cross-modal retrieval; deep hashing; semantic preserving; hierarchical learning
DOI
10.1109/TMM.2022.3140656
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cross-modal hashing has become a vital technique in cross-modal retrieval in recent years due to its fast query speed and low storage cost. Most prior supervised cross-modal hashing methods are flat methods designed for data with non-hierarchical labels. They treat different categories independently and ignore inter-category correlations. In practical applications, however, many instances are labeled with hierarchical categories, and the hierarchical label structure provides rich information about the relationships among categories. Hierarchical cross-modal hashing has been proposed to exploit these category correlations. However, existing methods aim to preserve instance-pairwise or class-pairwise similarities, which cannot fully explore the semantic correlations among different categories and makes the learned hash codes less discriminative. In this paper, we propose a deep cross-modal hashing method named hierarchical semantic structure preserving hashing (HSSPH), which directly exploits the label hierarchy information to learn discriminative hash codes. Specifically, HSSPH learns a set of class-wise hash codes for each layer. By augmenting class-wise codes with labels, it generates layer-wise prototype codes that reflect the semantic structure of each layer. To enhance the discriminative ability of the hash codes, HSSPH supervises hash code learning with both labels and semantic structures to preserve the hierarchical semantics. In addition, efficient optimization algorithms are developed to directly learn the discrete hash codes for each instance and each class. Extensive experiments on two benchmark datasets show the superiority of HSSPH over several state-of-the-art methods.
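The abstract attributes the appeal of cross-modal hashing to fast queries and low storage. A minimal sketch of why binary codes give this, assuming items from both modalities have already been mapped to codes of the same length: retrieval reduces to ranking by Hamming distance, which is an XOR followed by a bit count. The codes and the helper names (`pack_bits`, `hamming`, `rank_by_hamming`) are hypothetical and chosen for illustration; this is not the HSSPH training procedure, only the generic retrieval step that hashing methods share.

```python
# Illustrative only: toy 8-bit codes, not produced by HSSPH or any trained model.

def pack_bits(bits):
    """Pack a sequence of 0/1 values into one integer, first bit most significant."""
    code = 0
    for b in bits:
        code = (code << 1) | (1 if b else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by Hamming distance to the query.

    sorted() is stable, so ties keep their original index order.
    """
    return sorted(range(len(database_codes)),
                  key=lambda i: hamming(query_code, database_codes[i]))


# Hypothetical codes: one text query and three image entries.
text_query = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
image_db = [
    pack_bits([0, 1, 0, 0, 1, 1, 0, 1]),  # every bit differs from the query
    pack_bits([1, 0, 1, 1, 0, 0, 1, 1]),  # one bit differs
    pack_bits([1, 0, 0, 1, 0, 1, 1, 0]),  # two bits differ
]

ranking = rank_by_hamming(text_query, image_db)
distances = [hamming(text_query, c) for c in image_db]
print(ranking, distances)
```

Each item here occupies a single integer, and comparing two items costs one XOR and one bit count, which is the source of the storage and speed advantage mentioned in the abstract.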
Pages: 1217-1229
Number of pages: 13