Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels

Cited: 0
Authors
Zhang, Zhilu [1]
Sabuncu, Mert R. [1]
Affiliation
[1] Cornell Univ, Meinig Sch Biomed Engn, Elect & Comp Engn, Ithaca, NY 14853 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Vol. 31
Funding
US National Science Foundation;
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Deep neural networks (DNNs) have achieved tremendous success in a variety of applications across many disciplines. Yet, their superior performance comes at the expensive cost of requiring correctly annotated large-scale datasets. Moreover, due to DNNs' rich capacity, errors in training labels can hamper performance. To combat this problem, mean absolute error (MAE) has recently been proposed as a noise-robust alternative to the commonly used categorical cross entropy (CCE) loss. However, as we show in this paper, MAE can perform poorly with DNNs and challenging datasets. Here, we present a theoretically grounded set of noise-robust loss functions that can be seen as a generalization of MAE and CCE. The proposed loss functions can be readily applied with any existing DNN architecture and algorithm, while yielding good performance in a wide range of noisy label scenarios. We report results from experiments conducted with the CIFAR-10, CIFAR-100 and FASHION-MNIST datasets and synthetically generated noisy labels.
Pages: 11
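The abstract describes a family of noise-robust losses that interpolate between CCE and MAE: the Lq loss L_q(f(x), e_j) = (1 - f_j(x)^q) / q with q in (0, 1], which recovers CCE in the limit q -> 0 and equals MAE (up to scaling) at q = 1. Below is a minimal PyTorch sketch of this loss; the class name, the default q = 0.7, and the softmax-based forward pass are illustrative assumptions rather than the authors' reference implementation.

```python
# Minimal sketch of the generalized cross entropy (Lq) loss:
#   L_q(f(x), e_j) = (1 - f_j(x)^q) / q,  q in (0, 1]
# q -> 0 recovers categorical cross entropy; q = 1 is MAE up to scaling.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GeneralizedCrossEntropy(nn.Module):
    def __init__(self, q: float = 0.7):  # default q is illustrative
        super().__init__()
        assert 0.0 < q <= 1.0, "q must lie in (0, 1]"
        self.q = q

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Probability the model assigns to each sample's (possibly noisy) label.
        probs = F.softmax(logits, dim=1)
        p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        # Lq loss: behaves like -log(p) for small q and like 1 - p at q = 1.
        loss = (1.0 - p_true.pow(self.q)) / self.q
        return loss.mean()


if __name__ == "__main__":
    criterion = GeneralizedCrossEntropy(q=0.7)
    logits = torch.randn(8, 10)            # batch of 8 samples, 10 classes
    labels = torch.randint(0, 10, (8,))    # integer class labels, possibly noisy
    print(criterion(logits, labels).item())
```

Smaller q behaves more like CCE (faster convergence, less noise robustness), while q closer to 1 behaves more like MAE (more noise robustness, harder optimization); the paper additionally studies a truncated variant of this loss, which is not sketched here.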