Heterogeneous Interactive Learning Network for Unsupervised Cross-Modal Retrieval

Cited: 0
Authors
Zheng, Yuanchao [1 ]
Zhang, Xiaowei [1 ]
Affiliations
[1] Qingdao Univ, Qingdao, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal hashing; Heterogeneous interaction; Adversarial loss
DOI
10.1007/978-3-031-26316-3_41
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Cross-modal hashing has received considerable attention because of its low storage cost and high retrieval efficiency. However, existing cross-modal retrieval approaches often fail to align semantic information effectively due to the information asymmetry between the image and text modalities. To address this issue, we propose the Heterogeneous Interactive Learning Network (HILN) for unsupervised cross-modal retrieval, which alleviates the heterogeneous semantic gap. Specifically, we introduce a multi-head self-attention mechanism to capture the global dependencies of semantic features within each modality. Moreover, since the semantic relations among object entities are consistent across modalities, we fuse heterogeneous features through a heterogeneous feature interaction module, in which cross-attention learns the interactions between features of different modalities. Finally, to further maintain semantic consistency, we introduce an adversarial loss into network learning to generate more robust hash codes. Extensive experiments demonstrate that the proposed HILN improves the accuracy of the T→I and I→T cross-modal retrieval tasks by 7.6% and 5.5%, respectively, over the best competitor, DGCPN, on the NUS-WIDE dataset. Code is available at https://github.com/Z000204/HILN.
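The abstract names two attention steps: intra-modal multi-head self-attention followed by cross-attention between modalities. As a reading aid, below is a minimal PyTorch sketch of that pattern; it is an illustrative assumption, not the authors' released HILN implementation (see the GitHub link above for that). The class name HeterogeneousInteraction, the feature dimension, the head count, and all variable names are invented for the example, and the hashing head and adversarial discriminator mentioned in the abstract are omitted.

# Hypothetical sketch of the attention pipeline the abstract describes.
# Assumes pre-extracted image-region and text-token features; all names
# and hyperparameters here are illustrative, not from the paper.
import torch
import torch.nn as nn

class HeterogeneousInteraction(nn.Module):
    def __init__(self, dim=512, heads=8):
        super().__init__()
        # Intra-modal multi-head self-attention: captures global
        # dependencies of semantic features within each modality.
        self.img_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.txt_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Cross-attention for heterogeneous feature interaction:
        # each modality queries the other modality's features.
        self.img2txt = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.txt2img = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, img, txt):
        # img: (B, N_i, dim) image-region features
        # txt: (B, N_t, dim) text-token features
        img, _ = self.img_self(img, img, img)
        txt, _ = self.txt_self(txt, txt, txt)
        # Queries from one modality, keys/values from the other.
        img_fused, _ = self.img2txt(img, txt, txt)
        txt_fused, _ = self.txt2img(txt, img, img)
        return img_fused, txt_fused

# Toy usage: batch of 4, 36 image regions, 20 text tokens, 512-d features.
model = HeterogeneousInteraction()
img_f, txt_f = model(torch.randn(4, 36, 512), torch.randn(4, 20, 512))
print(img_f.shape, txt_f.shape)  # (4, 36, 512) and (4, 20, 512)

In the full method, the fused features would presumably feed a hashing head producing binary codes, trained jointly with the adversarial loss for semantic consistency; that machinery is beyond this sketch.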
Pages: 692-707 (16 pages)
Related Papers (50 in total)
• [31] Learning Cross-Modal Retrieval with Noisy Labels. Hu, Peng; Peng, Xi; Zhu, Hongyuan; Zhen, Liangli; Lin, Jie. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021), 2021: 5399-5409.
• [32] Joint-Modal Graph Convolutional Hashing for unsupervised cross-modal retrieval. Meng, Hui; Zhang, Huaxiang; Liu, Li; Liu, Dongmei; Lu, Xu; Guo, Xinru. Neurocomputing, 2024, 595.
• [33] Hybrid representation learning for cross-modal retrieval. Cao, Wenming; Lin, Qiubin; He, Zhihai; He, Zhiquan. Neurocomputing, 2019, 345: 45-57.
• [34] Multimodal Graph Learning for Cross-Modal Retrieval. Xie, Jingyou; Zhao, Zishuo; Lin, Zhenzhou; Shen, Ying. Proceedings of the 2023 SIAM International Conference on Data Mining (SDM), 2023: 145-153.
• [35] Federated learning for supervised cross-modal retrieval. Li, Ang; Li, Yawen; Shao, Yingxia. World Wide Web-Internet and Web Information Systems, 2024, 27 (4).
• [36] Dual variational network for unsupervised cross-modal hashing. Deng, Xuran; Liu, Zhihang; Li, Pandeng. International Journal of Machine Learning and Cybernetics, 2024.
• [37] Image-text bidirectional learning network based cross-modal retrieval. Li, Zhuoyi; Lu, Huibin; Fu, Hao; Gu, Guanghua. Neurocomputing, 2022, 483: 148-159.
• [38] Cross-modal hash retrieval based on semantic multiple similarity learning and interactive projection matrix learning. Tan, Junpeng; Yang, Zhijing; Ye, Jielin; Chen, Ruihan; Cheng, Yongqiang; Qin, Jinghui; Chen, Yongfeng. Information Sciences, 2023, 648.
• [39] Dark knowledge association guided hashing for unsupervised cross-modal retrieval. Kang, Han; Zhang, Xiaowei; Han, Wenpeng; Zhou, Mingliang. Multimedia Systems, 2024, 30 (6).
• [40] Unsupervised Cross-modal Hash Retrieval Fusing Multiple Instance Relations. Li, Z.-X.; Hou, C.-W.; Xie, X.-M. Ruan Jian Xue Bao/Journal of Software, 2023, 34 (11): 4973-4988.