A Cross-modal image retrieval method based on contrastive learning

Cited by: 0

Authors:

Zhou, Wen [1]

Affiliations:

[1] Wuhan Vocat Coll Software & Engn, Wuhan 430205, Peoples R China

Source:

JOURNAL OF OPTICS-INDIA | 2023 / Vol. 53 / Issue 3

Keywords:

Cross-modal; Contrastive learning; Picture and text retrieval;

DOI:

10.1007/s12596-023-01382-9

Chinese Library Classification (CLC) number:

O43 [Optics];

Discipline classification code:

070207 ; 0803 ;

Abstract:

With the growth of large-scale image surveillance and video data in the field of public security, image retrieval has significant application value. Image-level retrieval struggles to distinguish different instances within an image and suffers from a lack of visible objects. To address these problems, we apply a cross-modal instance-level retrieval method in which text is used to retrieve images or videos. Natural language-based image search poses a new challenge for the fine-grained understanding of visual and linguistic patterns. To bridge language and vision, we propose a cross-modal image retrieval method based on contrastive learning, which consists of two parts: multi-granularity cross-modal contrastive learning, and a multi-modal fusion module based on a similarity matrix and encoder-decoder captioning. In experiments on open-source data sets, our method outperforms several state-of-the-art (SOTA) cross-modality baselines, and ablation experiments demonstrate its effectiveness.
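As a rough illustration of what contrastive learning over an image-text similarity matrix can look like, the sketch below computes a generic symmetric InfoNCE-style loss. It is not taken from the paper: the function names, the cosine similarity, and the temperature value of 0.07 are all assumptions chosen for illustration, and the paper's multi-granularity scheme and fusion module are not modeled here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_matrix(image_embs, text_embs):
    """Entry [i][j] is the cosine similarity between image i and text j."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where row k's correct class is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[k]
    return total / len(logits)


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of image-to-text and text-to-image losses (assumed formulation).

    Pair i in image_embs is assumed to match pair i in text_embs.
    """
    sim = similarity_matrix(image_embs, text_embs)
    logits = [[s / temperature for s in row] for row in sim]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


if __name__ == "__main__":
    # Toy 2-D embeddings, hypothetical values for demonstration only.
    images = [[1.0, 0.0], [0.0, 1.0]]
    aligned_texts = [[0.9, 0.1], [0.1, 0.9]]
    swapped_texts = [[0.1, 0.9], [0.9, 0.1]]
    print("aligned loss:", symmetric_contrastive_loss(images, aligned_texts))
    print("swapped loss:", symmetric_contrastive_loss(images, swapped_texts))
```

In this toy setup the loss is small when each image embedding lies close to its paired text embedding and large when the pairs are swapped, which is the behavior a contrastive objective is meant to reward.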

Pages: 2098-2107

Page count: 10