MSSPQ: Multiple Semantic Structure-Preserving Quantization for Cross-Modal Retrieval

Cited by: 8

Authors:

Zhu, Lei ^{[1]}

Cai, Liewu ^{[1]}

Song, Jiayu ^{[2]}

Zhu, Xinghui ^{[1]}

Zhang, Chengyuan ^{[1]}

Zhang, Shichao ^{[2]}

Affiliations:

[1] Hunan Agr Univ, Coll Informat & Intelligence, Changsha, Hunan, Peoples R China

[2] Cent South Univ, Sch Comp Sci & Engn, Changsha, Hunan, Peoples R China

Source:

PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022 | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

Cross-modal hashing; Semantic correlation; Quantization; Deep learning; NETWORK;

DOI:

10.1145/3512527.3531417

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

Cross-modal hashing is an active topic in the multimedia community; it aims to generate compact hash codes from multimedia content for efficient cross-modal search. Two challenges cannot be ignored: (1) how to efficiently enhance cross-modal semantic mining, which is essential for cross-modal hash code learning, and (2) how to combine the learning of multiple semantic correlations to improve semantic similarity preservation. To this end, this paper proposes a novel end-to-end cross-modal hashing approach, named Multiple Semantic Structure-Preserving Quantization (MSSPQ), which integrates a deep hashing model with multiple semantic correlation learning to boost hash learning performance. The multiple semantic correlation learning consists of inter-modal and intra-modal pairwise correlation learning and cosine correlation learning, which together comprehensively capture cross-modal consistent semantics and preserve semantic similarity. Extensive experiments on three multimedia datasets confirm that the proposed method outperforms the baselines.


Pages: 631-638

Page count: 8
