Heterogeneous Interactive Learning Network for Unsupervised Cross-Modal Retrieval

Cited by: 0
|
Authors
Zheng, Yuanchao [1 ]
Zhang, Xiaowei [1 ]
Affiliations
[1] Qingdao Univ, Qingdao, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal hashing; Heterogeneous interactive; Adversarial loss;
DOI
10.1007/978-3-031-26316-3_41
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cross-modal hashing has received considerable attention owing to its low storage cost and high retrieval efficiency. However, existing cross-modal retrieval approaches often fail to align semantic information effectively because of the information asymmetry between the image and text modalities. To address this issue, we propose the Heterogeneous Interactive Learning Network (HILN) for unsupervised cross-modal retrieval, which alleviates the heterogeneous semantic gap. Specifically, we introduce a multi-head self-attention mechanism to capture the global dependencies of semantic features within each modality. Moreover, since the semantic relations among object entities are consistent across modalities, we perform heterogeneous feature fusion through a heterogeneous feature interaction module, in which cross attention learns the interactions between features of different modalities. Finally, to further maintain semantic consistency, we introduce an adversarial loss into network learning to generate more robust hash codes. Extensive experiments demonstrate that the proposed HILN improves the accuracy of text-to-image (T->I) and image-to-text (I->T) cross-modal retrieval by 7.6% and 5.5%, respectively, over the best competitor, DGCPN, on the NUS-WIDE dataset. Code is available at https://github.com/Z000204/HILN.
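The abstract names three components: intra-modal multi-head self-attention, a cross-attention interaction module, and an adversarial loss. Below is a minimal PyTorch sketch of how such a pipeline could be wired together; the class names (HeterogeneousInteraction, ModalityDiscriminator), all dimensions, and the pooling and hashing choices are illustrative assumptions, not the paper's actual configuration.

import torch
import torch.nn as nn

class HeterogeneousInteraction(nn.Module):
    # Sketch of the two attention stages named in the abstract:
    # per-modality self-attention, then cross attention between
    # modalities. Layer sizes are assumptions, not the paper's.
    def __init__(self, dim=512, heads=8, hash_bits=64):
        super().__init__()
        # intra-modal: global dependencies within one modality
        self.img_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.txt_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        # inter-modal: each modality attends to the other
        self.img2txt = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.txt2img = nn.MultiheadAttention(dim, heads, batch_first=True)
        # tanh yields relaxed codes in (-1, 1); sign() binarizes later
        self.img_hash = nn.Sequential(nn.Linear(dim, hash_bits), nn.Tanh())
        self.txt_hash = nn.Sequential(nn.Linear(dim, hash_bits), nn.Tanh())

    def forward(self, img, txt):  # (batch, seq, dim) feature sequences
        img, _ = self.img_self(img, img, img)
        txt, _ = self.txt_self(txt, txt, txt)
        # cross attention: queries from one modality, keys/values from the other
        fused_img, _ = self.img2txt(img, txt, txt)
        fused_txt, _ = self.txt2img(txt, img, img)
        b_img = self.img_hash(fused_img.mean(dim=1))  # (batch, hash_bits)
        b_txt = self.txt_hash(fused_txt.mean(dim=1))
        return b_img, b_txt

class ModalityDiscriminator(nn.Module):
    # Adversarial part: guesses which modality a code came from; the
    # encoder is trained to fool it, pulling the two code distributions
    # together. This is a common adversarial-alignment recipe, assumed
    # here rather than taken from the paper.
    def __init__(self, hash_bits=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(hash_bits, 128),
                                 nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, codes):
        return self.net(codes)  # logit: image (1) vs. text (0)

# One adversarial step on toy inputs: the discriminator learns to
# separate modalities; the encoder is updated with the flipped labels.
model, disc = HeterogeneousInteraction(), ModalityDiscriminator()
bce = nn.BCEWithLogitsLoss()
img = torch.randn(4, 49, 512)  # e.g. CNN patch features
txt = torch.randn(4, 16, 512)  # e.g. token embeddings
b_img, b_txt = model(img, txt)
d_loss = bce(disc(b_img.detach()), torch.ones(4, 1)) + \
         bce(disc(b_txt.detach()), torch.zeros(4, 1))  # train D
g_loss = bce(disc(b_img), torch.zeros(4, 1)) + \
         bce(disc(b_txt), torch.ones(4, 1))            # fool D

At retrieval time the relaxed codes would be binarized with torch.sign and compared by Hamming distance; the quantization and similarity-preservation losses, which the abstract does not detail, are omitted from this sketch.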
Pages: 692-707
Page count: 16
Related Papers
50 records in total
  • [21] Unsupervised Deep Cross-Modal Hashing by Knowledge Distillation for Large-scale Cross-modal Retrieval
    Li, Mingyong
    Wang, Hongya
    PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 183 - 191
  • [22] Heterogeneous Graph Fusion Network for cross-modal image-text retrieval
    Qin, Xueyang
    Li, Lishuang
    Pang, Guangyao
    Hao, Fei
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [23] Query Aware Dual Contrastive Learning Network for Cross-modal Retrieval
    Yin M.-R.
    Liang M.-Y.
    Yu Y.
    Cao X.-W.
    Du J.-P.
    Xue Z.
    Ruan Jian Xue Bao/Journal of Software, 2024, 35 (05): : 2120 - 2132
  • [24] Deep Unsupervised Momentum Contrastive Hashing for Cross-modal Retrieval
    Lu, Kangkang
    Yu, Yanhua
    Liang, Meiyu
    Zhang, Min
    Cao, Xiaowen
    Zhao, Zehua
    Yin, Mengran
    Xue, Zhe
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 126 - 131
  • [25] Multimodal adversarial network for cross-modal retrieval
    Hu, Peng
    Peng, Dezhong
    Wang, Xu
    Xiang, Yong
    KNOWLEDGE-BASED SYSTEMS, 2019, 180 : 38 - 50
  • [26] Deep Memory Network for Cross-Modal Retrieval
    Song, Ge
    Wang, Dong
    Tan, Xiaoyang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (05) : 1261 - 1275
  • [27] UNSUPERVISED CONTRASTIVE HASHING FOR CROSS-MODAL RETRIEVAL IN REMOTE SENSING
    Mikriukov, Georgii
    Ravanbakhsh, Mahdyar
    Demir, Begum
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4463 - 4467
  • [28] Unsupervised Deep Imputed Hashing for Partial Cross-modal Retrieval
    Chen, Dong
    Cheng, Miaomiao
    Min, Chen
    Jing, Liping
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
  • [29] Revising similarity relationship hashing for unsupervised cross-modal retrieval
    Wu, You
    Li, Bo
    Li, Zhixin
    NEUROCOMPUTING, 2025, 614
  • [30] Cross-Modal Retrieval Using Deep Learning
    Malik, Shaily
    Bhardwaj, Nikhil
    Bhardwaj, Rahul
    Kumar, Saurabh
    PROCEEDINGS OF THIRD DOCTORAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE, DOSCI 2022, 2023, 479 : 725 - 734