Towards learning a semantic-consistent subspace for cross-modal retrieval

Cited by: 5
Authors
Xu, Meixiang [1 ,2 ]
Zhu, Zhenfeng [1 ,2 ]
Zhao, Yao [1 ,2 ]
Affiliations
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
[2] Beijing Key Lab Adv Informat Sci & Network Techno, Beijing 100044, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal; Semantic-correlation; Subspace learning; Multi-label;
DOI
10.1007/s11042-018-6578-0
CLC Classification Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
A great many approaches have been developed for cross-modal retrieval, among which subspace learning based methods dominate the landscape. Depending on whether semantic label information is used, subspace learning based approaches fall into two paradigms: unsupervised and supervised. However, for multi-label cross-modal retrieval, supervised approaches simply exploit multi-label information to learn a discriminative subspace, without considering the correlations among the multiple labels shared across modalities, which often leads to unsatisfactory retrieval performance. To address this issue, in this paper we propose a general framework that jointly incorporates semantic correlations into subspace learning for multi-label cross-modal retrieval. By introducing an HSIC-based regularization term, the correlation information among multiple labels is leveraged, and at the same time the consistency between the similarity structures of the individual modalities is well preserved. In addition, through the semantic-consistency projection, the semantic gap between the low-level feature space of each modality and the shared high-level semantic space is bridged by a mid-level consistent subspace, in which multi-label cross-modal retrieval can be performed effectively and efficiently. To solve the resulting optimization problem, an effective iterative algorithm is designed, and its convergence is analyzed both theoretically and experimentally. Experimental results on real-world datasets demonstrate the superiority of the proposed method over several existing cross-modal subspace learning methods.
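The HSIC-based regularization mentioned in the abstract builds on the empirical Hilbert-Schmidt Independence Criterion, which scores the statistical dependence between two sets of samples via their kernel (Gram) matrices. A minimal NumPy sketch of that criterion follows; the toy label matrix `Y`, feature matrices `X` and `Z`, and all variable names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def hsic(K, L):
    """Empirical HSIC estimate tr(K H L H) / (n - 1)^2 for two n x n
    kernel (Gram) matrices, where H = I - (1/n) 11^T is the centering
    matrix. Larger values indicate stronger dependence between views."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Toy multi-label setup (illustrative only): a binary label matrix Y,
# features X correlated with the labels, and independent features Z.
rng = np.random.default_rng(0)
Y = rng.integers(0, 2, size=(20, 5)).astype(float)
X = Y @ rng.normal(size=(5, 8)) + 0.1 * rng.normal(size=(20, 8))
Z = rng.normal(size=(20, 8))

K = Y @ Y.T                   # label kernel
print(hsic(K, X @ X.T))       # dependence with label-driven features
print(hsic(K, Z @ Z.T))       # typically much smaller for independent features
```

Used as a regularizer, a term of this form rewards projections whose modality similarities stay consistent with the label correlations.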
Pages: 389-412
Number of pages: 24
Related Papers
50 records in total
  • [21] Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval
    Xu, Xing
    Song, Jingkuan
    Lu, Huimin
    Yang, Yang
    Shen, Fumin
    Huang, Zi
    ICMR '18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2018, : 46 - 54
  • [22] Deep Semantic Mapping for Cross-Modal Retrieval
    Wang, Cheng
    Yang, Haojin
    Meinel, Christoph
    2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015, : 234 - 241
  • [24] Semantic-enhanced discriminative embedding learning for cross-modal retrieval
    Pan, Hao
    Huang, Jun
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2022, 11 (03) : 369 - 382
  • [25] Semantic consistency hashing for cross-modal retrieval
    Yao, Tao
    Kong, Xiangwei
    Fu, Haiyan
    Tian, Qi
    NEUROCOMPUTING, 2016, 193 : 250 - 259
  • [27] Analyzing semantic correlation for cross-modal retrieval
    Xie, Liang
    Pan, Peng
    Lu, Yansheng
    MULTIMEDIA SYSTEMS, 2015, 21 (06) : 525 - 539
  • [28] Learning Semantic Structure-preserved Embeddings for Cross-modal Retrieval
    Wu, Yiling
    Wang, Shuhui
    Huang, Qingming
    PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18), 2018, : 825 - 833
  • [29] Collaborative Subspace Graph Hashing for Cross-modal Retrieval
    Zhang, Xiang
    Dong, Guohua
    Du, Yimo
    Wu, Chengkun
    Luo, Zhigang
    Yang, Canqun
    ICMR '18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2018, : 213 - 221
  • [30] An Orthogonal Subspace Decomposition Method for Cross-Modal Retrieval
    Zeng, Zhixiong
    Xu, Nan
    Mao, Wenji
    Zeng, Daniel
    IEEE INTELLIGENT SYSTEMS, 2022, 37 (03) : 45 - 53