Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels

Cited: 0
Authors
Zhang, Zhilu [1]
Sabuncu, Mert R. [1]
Affiliation
[1] Cornell Univ, Meinig Sch Biomed Engn, Elect & Comp Engn, Ithaca, NY 14853 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Vol. 31
Funding
US National Science Foundation;
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Deep neural networks (DNNs) have achieved tremendous success in a variety of applications across many disciplines. Yet, their superior performance comes at the expensive cost of requiring correctly annotated large-scale datasets. Moreover, due to DNNs' rich capacity, errors in training labels can hamper performance. To combat this problem, mean absolute error (MAE) has recently been proposed as a noise-robust alternative to the commonly used categorical cross entropy (CCE) loss. However, as we show in this paper, MAE can perform poorly with DNNs and challenging datasets. Here, we present a theoretically grounded set of noise-robust loss functions that can be seen as a generalization of MAE and CCE. The proposed loss functions can be readily applied with any existing DNN architecture and algorithm, while yielding good performance in a wide range of noisy label scenarios. We report results from experiments conducted with the CIFAR-10, CIFAR-100 and FASHION-MNIST datasets and synthetically generated noisy labels.
Pages: 11
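The abstract describes a family of noise-robust losses that interpolate between CCE and MAE: the Lq loss L_q(f(x), e_j) = (1 - f_j(x)^q) / q with q in (0, 1], which recovers CCE in the limit q -> 0 and equals MAE (up to scaling) at q = 1. Below is a minimal PyTorch sketch of this loss; the class name, the default q = 0.7, and the softmax-based forward pass are illustrative assumptions rather than the authors' reference implementation.

```python
# Minimal sketch of the generalized cross entropy (Lq) loss:
#   L_q(f(x), e_j) = (1 - f_j(x)^q) / q,  q in (0, 1]
# q -> 0 recovers categorical cross entropy; q = 1 is MAE up to scaling.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GeneralizedCrossEntropy(nn.Module):
    def __init__(self, q: float = 0.7):  # default q is illustrative
        super().__init__()
        assert 0.0 < q <= 1.0, "q must lie in (0, 1]"
        self.q = q

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Probability the model assigns to each sample's (possibly noisy) label.
        probs = F.softmax(logits, dim=1)
        p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        # Lq loss: behaves like -log(p) for small q and like 1 - p at q = 1.
        loss = (1.0 - p_true.pow(self.q)) / self.q
        return loss.mean()


if __name__ == "__main__":
    criterion = GeneralizedCrossEntropy(q=0.7)
    logits = torch.randn(8, 10)            # batch of 8 samples, 10 classes
    labels = torch.randint(0, 10, (8,))    # integer class labels, possibly noisy
    print(criterion(logits, labels).item())
```

Smaller q behaves more like CCE (faster convergence, less noise robustness), while q closer to 1 behaves more like MAE (more noise robustness, harder optimization); the paper additionally studies a truncated variant of this loss, which is not sketched here.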