Multi-hop Interactive Cross-Modal Retrieval

Cited: 0
Authors
Ning, Xuecheng [1 ]
Yang, Xiaoshan [2 ,3 ,4 ]
Xu, Changsheng [1 ,2 ,3 ,4 ]
Affiliations
[1] HeFei Univ Technol, Hefei, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
[4] Peng Cheng Lab, Shenzhen, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal retrieval; Deep learning; LSTMs;
DOI
10.1007/978-3-030-37734-2_55
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Conventional representation-learning-based cross-modal retrieval approaches typically represent a sentence with a single global embedding, which neglects the local correlations between objects in the image and phrases in the sentence. In this paper, we present a novel Multi-hop Interactive Cross-modal Retrieval Model (MICRM), which interactively exploits the local correlations between images and words. We design a multi-hop interactive module to infer the high-order relevance between the image and the sentence. Experimental results on two benchmark datasets, MS-COCO and Flickr30K, demonstrate that our multi-hop interactive model performs significantly better than several competitive cross-modal retrieval methods.
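The abstract does not spell out the architecture, so the following is only a minimal NumPy sketch of the general multi-hop attention idea it alludes to: a query derived from word embeddings repeatedly attends over image region features, and the attended context refines the query at each hop before a final relevance score is computed. All shapes, the residual update, and the cosine-based scoring are illustrative assumptions, not the paper's actual MICRM module.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_hop_relevance(word_feats, region_feats, hops=3):
    """Illustrative multi-hop interaction (NOT the paper's exact model).

    word_feats:   (W, D) word embeddings of the sentence
    region_feats: (R, D) region features of the image
    Each hop attends over image regions with the current query and
    folds the attended context back into the query (residual update).
    """
    q = word_feats.mean(axis=0)               # initial query from the words
    for _ in range(hops):
        attn = softmax(region_feats @ q)      # attention weights over regions
        context = attn @ region_feats         # attended image context, (D,)
        q = q + context                       # refine the query with context
    # Final relevance: cosine similarity between the refined query
    # and the mean image representation (an assumed scoring choice).
    img = region_feats.mean(axis=0)
    denom = np.linalg.norm(q) * np.linalg.norm(img) + 1e-8
    return float(q @ img / denom)
```

In a trained model the attention and update steps would use learned projections (e.g. inside an LSTM cell, per the keywords) rather than raw dot products, and the hop count would be a hyperparameter tuned on the retrieval benchmarks.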
Pages: 681-693
Page count: 13
Related Papers
50 records total
  • [41] Multi-modal Subspace Learning with Dropout regularization for Cross-modal Recognition and Retrieval
    Cao, Guanqun
    Waris, Muhammad Adeel
    Iosifidis, Alexandros
    Gabbouj, Moncef
    2016 SIXTH INTERNATIONAL CONFERENCE ON IMAGE PROCESSING THEORY, TOOLS AND APPLICATIONS (IPTA), 2016,
  • [42] Multi-Level Correlation Adversarial Hashing for Cross-Modal Retrieval
    Ma, Xinhong
    Zhang, Tianzhu
    Xu, Changsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (12) : 3101 - 3114
  • [43] Deep Multi-Level Semantic Hashing for Cross-Modal Retrieval
    Ji, Zhenyan
    Yao, Weina
    Wei, Wei
    Song, Houbing
    Pi, Huaiyu
    IEEE ACCESS, 2019, 7 : 23667 - 23674
  • [44] METRIC BASED ON MULTI-ORDER SPACES FOR CROSS-MODAL RETRIEVAL
    Zhang, Liang
    Ma, Bingpeng
    Li, Guorong
    Huang, Qingming
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 1374 - 1379
  • [45] MVItem: A Benchmark for Multi-View Cross-Modal Item Retrieval
    Li, Bo
    Zhu, Jiansheng
    Dai, Linlin
    Jing, Hui
    Huang, Zhizheng
    Sui, Yuteng
    IEEE ACCESS, 2024, 12 : 119563 - 119576
  • [46] Graph Convolutional Multi-Label Hashing for Cross-Modal Retrieval
    Shen, Xiaobo
    Chen, Yinfan
    Liu, Weiwei
    Zheng, Yuhui
    Sun, Quan-Sen
    Pan, Shirui
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [47] Multi-class joint subspace learning for cross-modal retrieval
    Yu, En
    Li, Jing
    Wang, Li
    Zhang, Jia
    Wan, Wenbo
    Sun, Jiande
    PATTERN RECOGNITION LETTERS, 2020, 130 : 165 - 173
  • [48] MULTI-LEVEL CONTRASTIVE LEARNING FOR HYBRID CROSS-MODAL RETRIEVAL
    Zhao, Yiming
    Lu, Haoyu
    Zhao, Shiqi
    Wu, Haoran
    Lu, Zhiwu
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 6390 - 6394
  • [49] Hypergraph clustering based multi-label cross-modal retrieval
    Guo, Shengtang
    Zhang, Huaxiang
    Liu, Li
    Liu, Dongmei
    Lu, Xu
    Li, Liujian
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 103
  • [50] A Framework for Enabling Unpaired Multi-Modal Learning for Deep Cross-Modal Hashing Retrieval
    Williams-Lekuona, Mikel
    Cosma, Georgina
    Phillips, Iain
    JOURNAL OF IMAGING, 2022, 8 (12)