Learning Cross-Modal Retrieval with Noisy Labels

Cited by: 73
Authors
Hu, Peng [1 ,2 ]
Peng, Xi [1 ]
Zhu, Hongyuan [2 ]
Zhen, Liangli [3 ]
Lin, Jie [2 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Agcy Sci Technol & Res, Inst Infocomm Res, Singapore, Singapore
[3] Agcy Sci Technol & Res, Inst High Performance Comp, Singapore, Singapore
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
Funding
National Key Research and Development Program of China;
Keywords
HASHING NETWORK;
DOI
10.1109/CVPR46437.2021.00536
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, cross-modal retrieval has been emerging with the help of deep multimodal learning. However, collecting large-scale well-annotated data is expensive and time-consuming even for unimodal data, to say nothing of the additional challenges posed by multiple modalities. Although crowd-sourced annotation, e.g., Amazon's Mechanical Turk, can be utilized to mitigate the labeling cost, non-expert annotation inevitably introduces noisy labels. To tackle this challenge, this paper presents a general Multimodal Robust Learning framework (MRL) for learning with multimodal noisy labels, which mitigates noisy samples and correlates distinct modalities simultaneously. Specifically, we propose a Robust Clustering loss (RC) to make the deep networks focus on clean samples instead of noisy ones. In addition, a simple yet effective multimodal loss function, called Multimodal Contrastive loss (MC), is proposed to maximize the mutual information between different modalities, thus alleviating the interference of noisy samples and the cross-modal discrepancy. Extensive experiments are conducted on four widely-used multimodal datasets to demonstrate the effectiveness of the proposed approach in comparison with 14 state-of-the-art methods.
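The Multimodal Contrastive loss described above maximizes mutual information between modalities by pulling matched image/text pairs together and pushing mismatched ones apart. A minimal sketch of such an objective, in the common InfoNCE style, is shown below; the paper's exact MC formulation may differ, and the function name, the `temperature` parameter, and the NumPy implementation are illustrative assumptions, not the authors' code.

```python
import numpy as np

def multimodal_contrastive_loss(img_emb, txt_emb, temperature=0.1):
    """InfoNCE-style contrastive loss over paired image/text embeddings.

    The i-th image and i-th text form a positive pair; every other text in
    the batch acts as a negative. Minimizing this loss maximizes a lower
    bound on the mutual information between the two modalities.
    """
    # L2-normalize so dot products become cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (batch, batch) similarity matrix
    # Row-wise log-softmax, then cross-entropy against the diagonal
    # (the matched pairs)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

As a sanity check, perfectly aligned embeddings yield a lower loss than embeddings whose pairing has been shuffled, which is the behavior a cross-modal contrastive objective must exhibit.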
Pages: 5399-5409
Page count: 11