Hierarchical Semantic Structure Preserving Hashing for Cross-Modal Retrieval

Cited by: 19

Authors:

Wang, Di ^{[1,2]}

Zhang, Caiping ^{[3]}

Wang, Quan ^{[3]}

Tian, Yumin ^{[3]}

He, Lihuo ^{[4]}

Zhao, Lin ^{[2]}

Affiliations:

[1] Xidian Univ, Sch Comp Sci & Technol, Xian 710071, Peoples R China

[2] Nanjing Univ Sci & Technol, Jiangsu Key Lab Image & Video Understanding Social, Nanjing 210093, Jiangsu, Peoples R China

[3] Xidian Univ, Sch Comp Sci & Technol, Key Laboratory Smart Human Comp Interact & Wearable, Xian 710071, Peoples R China

[4] Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2023 / Vol. 25

Funding:

National Natural Science Foundation of China;

Keywords:

Semantics; Codes; Binary codes; Representation learning; Correlation; Hash functions; Feature extraction; Cross-modal retrieval; deep hashing; semantic preserving; hierarchical learning;

DOI:

10.1109/TMM.2022.3140656

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

In recent years, cross-modal hashing has become a vital technique in cross-modal retrieval due to its fast query speed and low storage cost. Most prior supervised cross-modal hashing methods are flat methods designed for non-hierarchically labeled data. They treat different categories independently and ignore inter-category correlations. In practical applications, however, many instances are labeled with hierarchical categories, and the hierarchical label structure provides rich information about the relationships among categories. To make rational use of category correlations, hierarchical cross-modal hashing has been proposed. However, existing methods aim to preserve instance-pairwise or class-pairwise similarities, which cannot fully explore the semantic correlations among different categories and makes the learned hash codes less discriminative. In this paper, we propose a deep cross-modal hashing method named hierarchical semantic structure preserving hashing (HSSPH), which directly exploits the label hierarchy information to learn discriminative hash codes. Specifically, HSSPH learns a set of class-wise hash codes for each layer. By augmenting class-wise codes with labels, it generates layer-wise prototype codes that reflect the semantic structure of each layer. To enhance the discriminative ability of the hash codes, HSSPH supervises hash code learning with both labels and semantic structures to preserve the hierarchical semantics. In addition, efficient optimization algorithms are developed to directly learn the discrete hash codes for each instance and each class. Extensive experiments on two benchmark datasets show the superiority of HSSPH over several state-of-the-art methods.


Pages: 1217-1229

Page count: 13

