LongReMix: Robust learning with high confidence samples in a noisy label environment

Cited by: 45
Authors
Cordeiro, Filipe R. [3 ]
Sachdeva, Ragav [2 ]
Belagiannis, Vasileios [5 ]
Reid, Ian [1 ]
Carneiro, Gustavo [1 ,4 ]
Affiliations
[1] Australian Inst Machine Learning, Sch Comp Sci, Adelaide, Australia
[2] Univ Oxford, Dept Engn Sci, Visual Geometry Grp, Oxford, England
[3] Univ Fed Rural Pernambuco, Dept Comp, Visual Comp Lab, Recife, Brazil
[4] Univ Surrey, Ctr Vis Speech & Signal Proc, Guildford, England
[5] Otto Guericke Univ Magdeburg, Magdeburg, Germany
Funding
Australian Research Council;
Keywords
Noisy label learning; Deep learning; Empirical vicinal risk; Semi-supervised learning;
DOI
10.1016/j.patcog.2022.109013
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
State-of-the-art noisy-label learning algorithms rely on unsupervised learning to classify training samples as clean or noisy, followed by semi-supervised learning (SSL) that minimises the empirical vicinal risk using a labelled set formed by samples classified as clean, and an unlabelled set of samples classified as noisy. The classification accuracy of such noisy-label learning methods depends on the precision of the unsupervised classification of clean and noisy samples, and on the robustness of SSL to small clean sets. We address these points with a new noisy-label training algorithm, called LongReMix, which improves the precision of the unsupervised classification of clean and noisy samples and the robustness of SSL to small clean sets with a two-stage learning process. Stage one of LongReMix finds a small but precise high-confidence clean set, and stage two augments this high-confidence clean set with new clean samples and oversamples the clean data to increase the robustness of SSL to small clean sets. We test LongReMix on CIFAR-10 and CIFAR-100 with synthetic noisy labels, and on the real-world noisy-label benchmarks CNWL (Red Mini-ImageNet), WebVision, Clothing1M, and Food101-N. The results show that LongReMix produces significantly better classification accuracy than competing approaches, particularly in high-noise-rate problems, and achieves state-of-the-art performance on most datasets. The code is available at https://github.com/filipe-research/LongReMix. (c) 2022 Elsevier Ltd. All rights reserved.
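The two-stage idea in the abstract can be sketched in a few lines. This is a minimal, illustrative simplification, not the paper's exact procedure: it assumes a small-loss criterion (samples a network fits with consistently low loss across epochs tend to be clean), and the function names, thresholds, and the voting-over-epochs rule are hypothetical stand-ins for LongReMix's actual high-confidence selection.

```python
import numpy as np

def find_high_confidence_clean_set(losses, threshold=0.5, min_epochs=3):
    """Stage one (simplified): keep samples whose normalised per-sample loss
    stays below `threshold` in at least `min_epochs` epochs.
    `losses` has shape (n_epochs, n_samples)."""
    norm = (losses - losses.min()) / (losses.max() - losses.min() + 1e-8)
    votes = (norm < threshold).sum(axis=0)          # per-sample count of "looks clean" epochs
    return np.where(votes >= min_epochs)[0]          # indices of the high-confidence clean set

def oversample_clean_set(indices, target_size, rng):
    """Stage two (simplified): oversample the clean set so the SSL stage sees
    `target_size` labelled samples even when few samples are clean."""
    extra = rng.choice(indices, size=max(0, target_size - len(indices)))
    return np.concatenate([indices, extra])

# Toy usage: 5 low-loss (clean-looking) and 5 high-loss (noisy-looking) samples over 4 epochs.
rng = np.random.default_rng(0)
losses = np.tile(np.concatenate([np.full(5, 0.1), np.full(5, 2.0)]), (4, 1))
clean = find_high_confidence_clean_set(losses)
labelled = oversample_clean_set(clean, target_size=8, rng=rng)
```

In the full method, the selected set feeds the labelled side of the SSL objective and the remaining samples form the unlabelled side; the oversampling step is what keeps SSL stable when the clean set is small.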
Pages: 14