Heterogeneous Interactive Learning Network for Unsupervised Cross-Modal Retrieval

Cited: 0
Authors
Zheng, Yuanchao [1]
Zhang, Xiaowei [1]
Institutions
[1] Qingdao University, Qingdao, China
Funding
National Natural Science Foundation of China
Keywords
Cross-modal hashing; Heterogeneous interactive; Adversarial loss
DOI
10.1007/978-3-031-26316-3_41
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Cross-modal hashing has attracted considerable attention because of its low storage cost and high retrieval efficiency. However, existing cross-modal retrieval approaches often fail to align semantic information effectively because of the information asymmetry between the image and text modalities. To address this issue, we propose a Heterogeneous Interactive Learning Network (HILN) for unsupervised cross-modal retrieval that alleviates the heterogeneous semantic gap. Specifically, we introduce a multi-head self-attention mechanism to capture the global dependencies of semantic features within each modality. Moreover, since the semantic relations among object entities are consistent across modalities, we perform heterogeneous feature fusion through a heterogeneous feature interaction module, whose cross-attention learns the interactions between features of different modalities. Finally, to further maintain semantic consistency, we introduce an adversarial loss into network training to generate more robust hash codes. Extensive experiments demonstrate that the proposed HILN improves the accuracy of T -> I and I -> T cross-modal retrieval tasks by 7.6% and 5.5%, respectively, over the best competitor DGCPN on the NUS-WIDE dataset. Code is available at https://github.com/Z000204/HILN.
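The pipeline the abstract describes (intra-modal multi-head self-attention, cross-attention fusion between modalities, then a shared hash head) can be sketched roughly as below. This is a minimal illustrative approximation, not the authors' released implementation: the class name, the direct use of nn.MultiheadAttention, the feature dimensions, and the mean-pooled tanh hash head are all assumptions.

```python
# Illustrative sketch of heterogeneous feature interaction for cross-modal
# hashing (assumed structure, not the official HILN code).
import torch
import torch.nn as nn

class HeterogeneousInteraction(nn.Module):
    def __init__(self, dim: int = 512, heads: int = 8, hash_bits: int = 64):
        super().__init__()
        # Intra-modal self-attention: global dependencies within each modality.
        self.img_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.txt_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Cross-attention: each modality queries the other's features.
        self.img2txt = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.txt2img = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Shared hash head maps fused features to continuous codes in (-1, 1).
        self.hash_head = nn.Sequential(nn.Linear(dim, hash_bits), nn.Tanh())

    def forward(self, img: torch.Tensor, txt: torch.Tensor):
        # img, txt: (batch, seq_len, dim) region/token feature sequences.
        img, _ = self.img_self(img, img, img)
        txt, _ = self.txt_self(txt, txt, txt)
        # Fuse heterogeneous semantics by attending across modalities.
        img_fused, _ = self.img2txt(img, txt, txt)
        txt_fused, _ = self.txt2img(txt, img, img)
        # Pool over the sequence and emit continuous hash codes; sign() would
        # binarize at retrieval time. The adversarial loss mentioned in the
        # abstract would be applied on top of these codes and is omitted here.
        img_code = self.hash_head(img_fused.mean(dim=1))
        txt_code = self.hash_head(txt_fused.mean(dim=1))
        return img_code, txt_code

img_code, txt_code = HeterogeneousInteraction()(
    torch.randn(4, 10, 512), torch.randn(4, 16, 512))
```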
Pages: 692-707 (16 pages)
Related Papers (50 total)
  • [1] Unsupervised Cross-Modal Retrieval through Adversarial Learning
    He, Li
    Xu, Xing
    Lu, Huimin
    Yang, Yang
    Shen, Fumin
    Shen, Heng Tao
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017: 1153-1158
  • [2] Heterogeneous Metric Learning for Cross-Modal Multimedia Retrieval
    Deng, Jun
    Du, Liang
    Shen, Yi-Dong
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2013, PT I, 2013, 8180: 43-56
  • [3] Coupled CycleGAN: Unsupervised Hashing Network for Cross-Modal Retrieval
    Li, Chao
    Deng, Cheng
    Wang, Lei
    Xie, De
    Liu, Xianglong
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019: 176-183
  • [4] Learning sufficient scene representation for unsupervised cross-modal retrieval
    Luo, Jieting
    Wo, Yan
    Wu, Bicheng
    Han, Guoqiang
    NEUROCOMPUTING, 2021, 461: 404-418
  • [5] Heterogeneous Attention Network for Effective and Efficient Cross-modal Retrieval
    Yu, Tan
    Yang, Yi
    Li, Yi
    Liu, Lin
    Fei, Hongliang
    Li, Ping
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021: 1146-1156
  • [6] Semantic Preservation and Hash Fusion Network for Unsupervised Cross-Modal Retrieval
    Shu, Xinsheng
    Li, Mingyong
    WEB AND BIG DATA, APWEB-WAIM 2024, PT V, 2024, 14965: 146-161
  • [7] Unsupervised Multi-modal Hashing for Cross-Modal Retrieval
    Yu, Jun
    Wu, Xiao-Jun
    Zhang, Donglin
    COGNITIVE COMPUTATION, 2022, 14(03): 1159-1171
  • [8] HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval
    Zhang, Chengyuan
    Song, Jiayu
    Zhu, Xiaofeng
    Zhu, Lei
    Zhang, Shichao
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17(01)
  • [9] Heterogeneous memory enhanced graph reasoning network for cross-modal retrieval
    Ji, Zhong
    Chen, Kexin
    He, Yuqing
    Pang, Yanwei
    Li, Xuelong
    SCIENCE CHINA INFORMATION SCIENCES, 2022, 65(07): 157-169