A Cross-modal image retrieval method based on contrastive learning

Cited by: 0

Authors:

Zhou, Wen [1]

Affiliation:

[1] Wuhan Vocat Coll Software & Engn, Wuhan 430205, Peoples R China

Source:

JOURNAL OF OPTICS-INDIA | 2023 / Volume 53 / Issue 3

Keywords:

Cross-modal; Contrastive learning; Picture and text retrieval;

DOI:

10.1007/s12596-023-01382-9

Chinese Library Classification:

O43 [Optics];

Discipline classification code:

070207 ; 0803 ;

Abstract:

With the growth of large-scale image surveillance and video data in the field of public security, image retrieval has become highly valuable in practice. To address two problems of image-level retrieval, namely the difficulty of distinguishing different instances in an image and the lack of visible objects, we apply a cross-modal, instance-level retrieval method that uses text queries to retrieve images or videos. Natural-language-based image search poses a new challenge: fine-grained understanding of visual and linguistic patterns. To bridge language and vision, we propose a cross-modal image retrieval method based on contrastive learning, which consists of two parts: multi-granularity cross-modal contrastive learning, and a multi-modal fusion module based on a similarity matrix and encoder-decoder captioning. In experiments on open-source datasets, our method outperforms several state-of-the-art (SOTA) cross-modality baselines, and ablation experiments demonstrate its effectiveness.
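
As a rough illustration of the kind of contrastive objective the abstract refers to, the sketch below builds an image-text similarity matrix and computes a symmetric InfoNCE-style loss over it. The abstract does not state the loss actually used, so this is an assumption rather than the author's method; the function names, the temperature value of 0.07, and the toy two-dimensional embeddings are all hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def similarity_matrix(image_embs, text_embs):
    """Cosine similarity between every image (rows) and every text (columns)."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of image-to-text and text-to-image losses (assumed formulation)."""
    sim = similarity_matrix(image_embs, text_embs)
    logits = [[s / temperature for s in row] for row in sim]
    logits_t = [list(col) for col in zip(*logits)]  # transpose for text-to-image
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy data: image i is paired with text i in the aligned case.
images = [[1.0, 0.0], [0.0, 1.0]]
texts_aligned = [[0.9, 0.1], [0.1, 0.9]]
texts_swapped = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = symmetric_contrastive_loss(images, texts_aligned)
loss_swapped = symmetric_contrastive_loss(images, texts_swapped)
```

In this toy setup the aligned pairs should yield a lower loss than the swapped ones, which is the behavior a contrastive objective of this kind is meant to reward.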

Pages: 2098-2107

Number of pages: 10
