A Cross-modal image retrieval method based on contrastive learning

Cited by: 0
Author
Zhou, Wen [1]
Affiliation
[1] Wuhan Vocat Coll Software & Engn, Wuhan 430205, Peoples R China
Source
JOURNAL OF OPTICS-INDIA | 2023 / Vol. 53 / No. 3
Keywords
Cross-modal; Contrastive learning; Picture and text retrieval;
DOI
10.1007/s12596-023-01382-9
Chinese Library Classification (CLC) number
O43 [Optics];
Discipline classification code
070207 ; 0803 ;
Abstract
With the growth of large-scale image surveillance and video data in the field of public security, image retrieval has become extremely valuable in practice. Image-level retrieval has difficulty distinguishing different instances within an image and suffers from a lack of visible objects. To address these problems, we apply a cross-modal, instance-level retrieval method that uses text queries to retrieve images or videos. Natural-language-based image search poses a new challenge for the fine-grained understanding of visual and linguistic patterns. To bridge language and vision, we propose a cross-modal image retrieval method based on contrastive learning. It consists of two parts: multi-granularity cross-modal contrastive learning, and a multi-modal fusion module based on a similarity matrix and encoder-decoder captioning. In experiments on open-source datasets, our method outperforms several state-of-the-art (SOTA) cross-modality baselines, and ablation experiments demonstrate its effectiveness.
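The abstract names contrastive learning over an image-text similarity matrix but does not give the loss. As a rough illustration only, the following is a minimal sketch of a generic symmetric contrastive (InfoNCE-style) loss, not the paper's multi-granularity formulation. The function names, the toy embeddings, and the temperature value of 0.07 are all assumptions made for this example.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(image_embs, text_embs):
    """Entry [i][j] is the cosine similarity between image i and text j."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row k is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[k]
    return total / len(logits)

def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of image-to-text and text-to-image losses.

    Assumes image k and text k form the matched pair (hypothetical setup);
    the temperature is an illustrative value, not taken from the paper.
    """
    sim = similarity_matrix(image_embs, text_embs)
    logits_i2t = [[s / temperature for s in row] for row in sim]
    logits_t2i = [list(col) for col in zip(*logits_i2t)]  # transpose
    return 0.5 * (cross_entropy_diagonal(logits_i2t)
                  + cross_entropy_diagonal(logits_t2i))

# Toy 2-D embeddings: matched pairs point in similar directions.
images = [[1.0, 0.0], [0.0, 1.0]]
texts_matched = [[0.9, 0.1], [0.1, 0.9]]
texts_swapped = [[0.1, 0.9], [0.9, 0.1]]

loss_matched = symmetric_contrastive_loss(images, texts_matched)
loss_swapped = symmetric_contrastive_loss(images, texts_swapped)
print(loss_matched < loss_swapped)
```

In this sketch the loss is small when each image is most similar to its own text and large when the pairing is swapped, which is the behavior a contrastive objective is meant to encourage.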
Pages: 2098-2107
Page count: 10
Related papers
50 in total
  • [1] A Cross-modal image retrieval method based on contrastive learning
    Zhou, Wen
    JOURNAL OF OPTICS-INDIA, 2024, 53 (03): : 2098 - 2107
  • [2] Cross-modal Contrastive Learning for Generalizable and Efficient Image-text Retrieval
    Haoyu Lu
    Yuqi Huo
    Mingyu Ding
    Nanyi Fei
    Zhiwu Lu
    Machine Intelligence Research, 2023, 20 : 569 - 582
  • [3] Cross-modal Contrastive Learning for Generalizable and Efficient Image-text Retrieval
    Lu, Haoyu
    Huo, Yuqi
    Ding, Mingyu
    Fei, Nanyi
    Lu, Zhiwu
    MACHINE INTELLIGENCE RESEARCH, 2023, 20 (04) : 569 - 582
  • [4] TRAJCROSS: Trajecotry Cross-Modal Retrieval with Contrastive Learning
    Jing, Quanliang
    Yao, Di
    Gong, Chang
    Fan, Xinxin
    Wang, Baoli
    Tan, Haining
    Bi, Jingping
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 344 - 349
  • [5] Soft Contrastive Cross-Modal Retrieval
    Song, Jiayu
    Hu, Yuxuan
    Zhu, Lei
    Zhang, Chengyuan
    Zhang, Jian
    Zhang, Shichao
    APPLIED SCIENCES-BASEL, 2024, 14 (05):
  • [6] Momentum Cross-Modal Contrastive Learning for Video Moment Retrieval
    Han, De
    Cheng, Xing
    Guo, Nan
    Ye, Xiaochun
    Rainer, Benjamin
    Priller, Peter
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 5977 - 5994
  • [7] Improving text-image cross-modal retrieval with contrastive loss
    Zhang, Chumeng
    Yang, Yue
    Guo, Junbo
    Jin, Guoqing
    Song, Dan
    Liu, An An
    MULTIMEDIA SYSTEMS, 2023, 29 (02) : 569 - 575
  • [8] Image-Text Cross-Modal Retrieval with Instance Contrastive Embedding
    Zeng, Ruigeng
    Ma, Wentao
    Wu, Xiaoqian
    Liu, Wei
    Liu, Jie
    ELECTRONICS, 2024, 13 (02)
  • [9] Improving text-image cross-modal retrieval with contrastive loss
    Chumeng Zhang
    Yue Yang
    Junbo Guo
    Guoqing Jin
    Dan Song
    An An Liu
    Multimedia Systems, 2023, 29 : 569 - 575
  • [10] Cross-Modal Attention Preservation with Self-Contrastive Learning for Composed Query-Based Image Retrieval
    Li, Shenshen
    Xu, Xing
    Jiang, Xun
    Shen, Fumin
    Sun, Zhe
    Cichocki, Andrzej
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (06)