Deep Semantic Space with Intra-class Low-rank Constraint for Cross-modal Retrieval

Times Cited: 9
Authors
Kang, Peipei [1 ]
Lin, Zehang [1 ]
Yang, Zhenguo [1 ,2 ]
Fang, Xiaozhao [3 ]
Li, Qing [4 ]
Liu, Wenyin [1 ]
Affiliations
[1] Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou, Guangdong, Peoples R China
[2] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[3] Guangdong Univ Technol, Dept Automat, Guangzhou, Guangdong, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
Source
ICMR'19: PROCEEDINGS OF THE 2019 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL | 2019
Funding
National Natural Science Foundation of China
Keywords
cross-modal retrieval; deep neural networks; intra-class low-rank; semantic space;
DOI
10.1145/3323873.3325029
CLC Classification Number
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
In this paper, a novel Deep Semantic Space learning model with Intra-class Low-rank constraint (DSSIL) is proposed for cross-modal retrieval. The model is composed of two subnetworks for modality-specific representation learning, followed by projection layers that map into a common space. In particular, DSSIL takes semantic consistency into account to fuse the cross-modal data in a high-level common space, and constrains the common representation matrix of each class to be low-rank so that intra-class representations become more closely related. More formally, two regularization terms are devised for these two aspects and incorporated into the objective of DSSIL. To optimize the modality-specific subnetworks and the projection layers simultaneously by gradient descent, the nonconvex low-rank constraint is approximated, with theoretical justification, by minimizing a few of the smallest singular values of the intra-class matrix. Extensive experiments on three public datasets demonstrate the superiority of DSSIL for cross-modal retrieval compared with state-of-the-art methods.
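
The abstract describes approximating the nonconvex intra-class low-rank constraint by minimizing a few of the smallest singular values of the per-class common representation matrix. Below is a minimal sketch of how such a regularizer could be written, assuming a PyTorch setting; the function name intra_class_low_rank_loss, the hyperparameter k, and the per-class loop are illustrative assumptions rather than the authors' implementation.

import torch

def intra_class_low_rank_loss(common_repr: torch.Tensor, labels: torch.Tensor, k: int = 2) -> torch.Tensor:
    # Hypothetical sketch: for each class, stack its common-space representations
    # into a matrix and penalize the sum of its k smallest singular values,
    # a smooth surrogate for driving the intra-class matrix toward low rank.
    loss = common_repr.new_zeros(())
    for c in labels.unique():
        block = common_repr[labels == c]      # intra-class representation matrix
        if block.shape[0] < 2:
            continue                          # a single sample carries no rank information
        sv = torch.linalg.svdvals(block)      # singular values, in descending order
        loss = loss + sv[-k:].sum()           # sum of the k smallest singular values
    return loss

In a full training objective, such a term would be weighted and added to the semantic-consistency loss, and the whole objective optimized jointly over the modality-specific subnetworks and the projection layers by gradient descent, as the abstract outlines.
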
Pages: 226-234
Number of Pages: 9
Related Papers
50 records in total
  • [31] Multispectral Foreground Detection via Robust Cross-Modal Low-Rank Decomposition
    Zheng, Aihua
    Zhao, Yumiao
    Li, Chenglong
    Tang, Jin
    Luo, Bin
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT I, 2018, 11164 : 819 - 829
  • [32] Multi-modal semantic autoencoder for cross-modal retrieval
    Wu, Yiling
    Wang, Shuhui
    Huang, Qingming
    NEUROCOMPUTING, 2019, 331 : 165 - 175
  • [33] Latent Space Semantic Supervision Based on Knowledge Distillation for Cross-Modal Retrieval
    Zhang, Li
    Wu, Xiangqian
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 7154 - 7164
  • [34] Learning Shared Semantic Space with Correlation Alignment for Cross-Modal Event Retrieval
    Yang, Zhenguo
    Lin, Zehang
    Kang, Peipei
    Lv, Jianming
    Li, Qing
    Liu, Wenyin
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2020, 16 (01)
  • [35] Multilevel Deep Semantic Feature Asymmetric Network for Cross-Modal Hashing Retrieval
    Jiang, Xiaolong
    Fan, Jiabao
    Zhang, Jie
    Lin, Ziyong
    Li, Mingyong
    IEEE LATIN AMERICA TRANSACTIONS, 2024, 22 (08) : 621 - 631
  • [36] Multi-attention based semantic deep hashing for cross-modal retrieval
    Zhu, Liping
    Tian, Gangyi
    Wang, Bingyao
    Wang, Wenjie
    Zhang, Di
    Li, Chengyang
    APPLIED INTELLIGENCE, 2021, 51 (08) : 5927 - 5939
  • [37] Cross-Modal Event Retrieval: A Dataset and a Baseline Using Deep Semantic Learning
    Situ, Runwei
    Yang, Zhenguo
    Lv, Jianming
    Li, Qing
    Liu, Wenyin
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2018, PT II, 2018, 11165 : 147 - 157
  • [38] Deep noise mitigation and semantic reconstruction hashing for unsupervised cross-modal retrieval
    Zhang, Cheng
    Wan, Yuan
    Qiang, Haopeng
    NEURAL COMPUTING AND APPLICATIONS, 2024, 36 : 5383 - 5397
  • [39] Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval
    Cheng, Shuli
    Wang, Liejun
    Du, Anyu
    ENTROPY, 2020, 22 (11) : 1 - 22
  • [40] Deep noise mitigation and semantic reconstruction hashing for unsupervised cross-modal retrieval
    Zhang, Cheng
    Wan, Yuan
    Qiang, Haopeng
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (10): 5383 - 5397