Multi-hop Interactive Cross-Modal Retrieval

被引:0
|
作者
Ning, Xuecheng [1 ]
Yang, Xiaoshan [2 ,3 ,4 ]
Xu, Changsheng [1 ,2 ,3 ,4 ]
机构
[1] HeFei Univ Technol, Hefei, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
[4] Peng Cheng Lab, Shenzhen, Peoples R China
来源
基金
中国国家自然科学基金;
关键词
Cross-modal retrieval; Deep learning; LSTMs;
D O I
10.1007/978-3-030-37734-2_55
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Conventional representation learning based cross-modal retrieval approaches always represent the sentence with a global embedding feature, which easily neglects the local correlations between objects in the image and phrases in the sentence. In this paper, we present a novel Multi-hop Interactive Cross-modal Retrieval Model (MICRM), which interactively exploits the local correlations between images and words. We design a multi-hop interactive module to infer the high-order relevance between the image and the sentence. Experimental results on two benchmark datasets, MS-COCO and Flickr30K, demonstrate that our multi-hop interactive model performs significantly better than several competitive cross-modal retrieval methods.
引用
收藏
页码:681 / 693
页数:13
相关论文
共 50 条
  • [21] Multi-hop Evidence Retrieval for Cross-document Relation Extraction
    Lu, Keming
    Hsu, I-Hung
    Zhou, Wenxuan
    Ma, Mingyu Derek
    Chen, Muhao
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 10336 - 10351
  • [22] Soft Contrastive Cross-Modal Retrieval
    Song, Jiayu
    Hu, Yuxuan
    Zhu, Lei
    Zhang, Chengyuan
    Zhang, Jian
    Zhang, Shichao
    APPLIED SCIENCES-BASEL, 2024, 14 (05):
  • [23] Probabilistic Embeddings for Cross-Modal Retrieval
    Chun, Sanghyuk
    Oh, Seong Joon
    de Rezende, Rafael Sampaio
    Kalantidis, Yannis
    Larlus, Diane
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 8411 - 8420
  • [24] Cross-modal Retrieval with Correspondence Autoencoder
    Feng, Fangxiang
    Wang, Xiaojie
    Li, Ruifan
    PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014, : 7 - 16
  • [25] Cross-modal retrieval with dual optimization
    Xu, Qingzhen
    Liu, Shuang
    Qiao, Han
    Li, Miao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (05) : 7141 - 7157
  • [26] Geometric Matching for Cross-Modal Retrieval
    Wang, Zheng
    Gao, Zhenwei
    Yang, Yang
    Wang, Guoqing
    Jiao, Chengbo
    Shen, Heng Tao
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, : 1 - 13
  • [27] CROSS-MODAL RETRIEVAL WITH NOISY LABELS
    Mandal, Devraj
    Biswas, Soma
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 2326 - 2330
  • [28] Cross-Modal Retrieval for CPSS Data
    Zhong, Fangming
    Wang, Guangze
    Chen, Zhikui
    Xia, Feng
    Min, Geyong
    IEEE ACCESS, 2020, 8 : 16689 - 16701
  • [29] Hashing for Cross-Modal Similarity Retrieval
    Liu, Yao
    Yuan, Yanhong
    Huang, Qiaoli
    Huang, Zhixing
    2015 11TH INTERNATIONAL CONFERENCE ON SEMANTICS, KNOWLEDGE AND GRIDS (SKG), 2015, : 1 - 8
  • [30] A Graph Model for Cross-modal Retrieval
    Wang, Shixun
    Pan, Peng
    Lu, Yansheng
    PROCEEDINGS OF 3RD INTERNATIONAL CONFERENCE ON MULTIMEDIA TECHNOLOGY (ICMT-13), 2013, 84 : 1090 - 1097