HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval

Cited by: 36
Authors
Zhang, Chengyuan [1 ]
Song, Jiayu [2 ]
Zhu, Xiaofeng [3 ]
Zhu, Lei [4 ]
Zhang, Shichao [2 ]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Hunan, Peoples R China
[2] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Hunan, Peoples R China
[3] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610054, Sichuan, Peoples R China
[4] Hunan Agr Univ, Coll Informat & Intelligence, Changsha 410128, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal retrieval; deep learning; intra-modal semantic correlation; hybrid cross-modal similarity;
DOI
10.1145/3412847
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
The purpose of cross-modal retrieval is to find the relationship between samples of different modalities and to use a sample in one modality to retrieve semantically similar samples in the others. Because data from different modalities present heterogeneous low-level features but semantically related high-level features, the central problem of cross-modal retrieval is how to measure the similarity between different modalities. In this article, we present a novel cross-modal retrieval method, named the Hybrid Cross-Modal Similarity Learning model (HCMSL for short). It aims to capture sufficient semantic information from both labeled and unlabeled cross-modal pairs, as well as from intra-modal pairs with the same classification label. Specifically, coupled deep fully connected networks are used to map cross-modal feature representations into a common subspace. A weight-sharing strategy is utilized between the two branches of the networks to diminish cross-modal heterogeneity. Furthermore, two Siamese CNN models are employed to learn intra-modal similarity from samples of the same modality. Comprehensive experiments on real datasets clearly demonstrate that our proposed technique achieves substantial improvements over state-of-the-art cross-modal retrieval techniques.
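To make the architecture described in the abstract concrete, the sketch below illustrates its two core ideas: coupled fully connected branches that project image and text features into a common subspace through a weight-shared layer, and a Siamese-style contrastive objective for similarity learning. This is a minimal PyTorch sketch under assumptions of our own (feature dimensions, layer sizes, margin, and all names are hypothetical illustrations), not the authors' implementation.

    # Hypothetical sketch of an HCMSL-style architecture; not the paper's code.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CommonSubspaceNet(nn.Module):
        """Coupled fully connected branches mapping image and text features
        into a common subspace; the final projection is weight-shared to
        diminish cross-modal heterogeneity (dimensions are assumed)."""
        def __init__(self, img_dim=4096, txt_dim=300, hidden=1024, common=256):
            super().__init__()
            self.img_branch = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU())
            self.txt_branch = nn.Sequential(nn.Linear(txt_dim, hidden), nn.ReLU())
            self.shared = nn.Linear(hidden, common)  # same weights serve both branches

        def forward(self, img_feat, txt_feat):
            u = F.normalize(self.shared(self.img_branch(img_feat)), dim=1)
            v = F.normalize(self.shared(self.txt_branch(txt_feat)), dim=1)
            return u, v

    def contrastive_loss(a, b, label, margin=0.5):
        # Siamese-style objective: pull semantically matched pairs together,
        # push mismatched pairs at least `margin` apart (label: 1 = similar).
        d = F.pairwise_distance(a, b)
        return (label * d.pow(2) + (1 - label) * F.relu(margin - d).pow(2)).mean()

    # Toy usage: random tensors stand in for pre-extracted image/text features.
    net = CommonSubspaceNet()
    u, v = net(torch.randn(8, 4096), torch.randn(8, 300))
    loss = contrastive_loss(u, v, torch.randint(0, 2, (8,)).float())
    loss.backward()

The same contrastive form can in principle be applied to intra-modal pairs (the role the abstract assigns to the two Siamese CNNs) by feeding two samples of one modality through a shared encoder.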
Pages: 22
Related Papers
50 items in total
  • [21] Semi-supervised cross-modal learning for cross modal retrieval and image annotation
    Zou, Fuhao
    Bai, Xingqiang
    Luan, Chaoyang
    Li, Kai
    Wang, Yunfei
    Ling, Hefei
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2019, 22 (02): 825-841
  • [22] Cross-modal melodic contour similarity
    Prince, Jon B.
    Schmuckler, Mark A.
    Thompson, William Forde
    Canadian Acoustics - Acoustique Canadienne, 2009, 37 (01): 35-49
  • [23] Children's Recognition of Cross-Modal Similarity
    Marks, L. E.
    Bornstein, M. H.
    CAHIERS DE PSYCHOLOGIE COGNITIVE-CURRENT PSYCHOLOGY OF COGNITION, 1985, 5 (3-4): 322-322
  • [24] Deep semantic similarity adversarial hashing for cross-modal retrieval
    Qiang, Haopeng
    Wan, Yuan
    Xiang, Lun
    Meng, Xiaojing
    NEUROCOMPUTING, 2020, 400: 24-33
  • [25] Enhancing Cross-Modal Hash Retrieval with Multiple Similarity Matrices
    Li, Z.
    Hou, C.
    Xie, X.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2022, 34 (06): 933-945
  • [26] Revising similarity relationship hashing for unsupervised cross-modal retrieval
    Wu, You
    Li, Bo
    Li, Zhixin
    NEUROCOMPUTING, 2025, 614
  • [27] Deep supervised fused similarity hashing for cross-modal retrieval
    Ng, W. W. Y.
    Xu, Y.
    Tian, X.
    Wang, H.
    Multimedia Tools and Applications, 2024, 83 (39): 86537-86555
  • [28] Similarity and diversity induced paired projection for cross-modal retrieval
    Li, Jinxing
    Li, Mu
    Lu, Guangming
    Zhang, Bob
    Yin, Hongpeng
    Zhang, David
    INFORMATION SCIENCES, 2020, 539: 215-228
  • [29] Deep Adversarial Learning Triplet Similarity Preserving Cross-Modal Retrieval Algorithm
    Li, Guokun
    Wang, Zhen
    Xu, Shibo
    Feng, Chuang
    Yang, Xiaohan
    Wu, Nannan
    Sun, Fuzhen
    MATHEMATICS, 2022, 10 (15)
  • [30] Infant cross-modal learning
    Chow, Hiu Mei
    Tsui, Angeline Sin-Mei
    Ma, Yuen Ki
    Yat, Mei Ying
    Tseng, Chia-huei
    I-PERCEPTION, 2014, 5 (04): 463-463