Towards harnessing feature embedding for robust learning with noisy labels

Cited by: 4
Authors
Zhang, Chuang [1]
Shen, Li [2]
Yang, Jian [3]
Gong, Chen [1, 4]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, PCA Lab, Minist Educ, Key Lab Intelligent Percept & Syst High Dimens In, Nanjing, Peoples R China
[2] JD Explore Acad, Beijing, Peoples R China
[3] Nankai Univ, Coll Comp Sci, Tianjin, Peoples R China
[4] Jiangsu Key Lab Image & Video Understanding Socia, Nanjing, Peoples R China
Keywords
Deep learning; Robust learning; Classification; Label noise
DOI
10.1007/s10994-022-06197-6
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The memorization effect of deep neural networks (DNNs) plays a pivotal role in recent label noise learning methods. To exploit this effect, model prediction-based methods have been widely adopted; they use the outputs of DNNs in the early stage of learning to correct noisy labels. However, we observe that the model makes mistakes during label prediction, resulting in unsatisfactory performance. By contrast, the features produced in the early stage of learning are more robust. Inspired by this observation, we propose a novel feature embedding-based method for deep learning with label noise, termed LabEl Noise Dilution (LEND). Specifically, we first compute a similarity matrix based on the current embedded features to capture the local structure of the training data. Then, the noisy supervision signals carried by mislabeled data are overwhelmed by nearby correctly labeled ones (i.e., label noise dilution), the effectiveness of which is guaranteed by the inherent robustness of feature embedding. Finally, the training data with diluted labels are used to train a robust classifier. Empirically, we conduct extensive experiments on both synthetic and real-world noisy datasets, comparing LEND with several representative robust learning approaches. The results verify the effectiveness of LEND.
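The dilution step described in the abstract can be illustrated with a small sketch. This is not the authors' LEND implementation: the abstract does not give the similarity measure, the neighborhood size, or the mixing rule, so the cosine similarity, the `k` nearest neighbors, the mixing weight `alpha`, and the names `dilute_labels`, `hard_labels`, `FEATURES`, and `NOISY_LABELS` are all assumptions chosen for illustration.

```python
import math


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between feature vectors (assumed measure)."""
    # Guard against zero vectors by treating their norm as 1.0.
    norms = [math.sqrt(sum(x * x for x in f)) or 1.0 for f in features]
    n = len(features)
    return [
        [
            sum(a * b for a, b in zip(features[i], features[j]))
            / (norms[i] * norms[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def dilute_labels(features, noisy_labels, num_classes, k=3, alpha=0.8):
    """Mix each sample's one-hot label with a similarity-weighted vote
    of its k most similar neighbors (hypothetical dilution rule)."""
    sim = cosine_similarity_matrix(features)
    n = len(features)
    soft_labels = []
    for i in range(n):
        # The k most similar samples, excluding the sample itself.
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )[:k]
        # Negative similarities contribute nothing to the vote.
        weights = [max(sim[i][j], 0.0) for j in neighbors]
        total = sum(weights)
        own = [1.0 if c == noisy_labels[i] else 0.0 for c in range(num_classes)]
        if total == 0.0:
            # No usable neighbors: keep the given label unchanged.
            soft_labels.append(own)
            continue
        vote = [0.0] * num_classes
        for j, w in zip(neighbors, weights):
            vote[noisy_labels[j]] += w / total
        soft_labels.append(
            [(1 - alpha) * o + alpha * v for o, v in zip(own, vote)]
        )
    return soft_labels


def hard_labels(soft_labels):
    """Index of the largest entry in each soft label."""
    return [max(range(len(row)), key=row.__getitem__) for row in soft_labels]


# Toy data: two clusters of 2-D "embeddings"; the fourth sample sits in
# the first cluster but carries the wrong label (1 instead of 0).
FEATURES = [
    [1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.95, 0.05],
    [0.0, 1.0], [0.1, 0.9], [0.1, 1.0], [0.05, 0.95],
]
NOISY_LABELS = [0, 0, 0, 1, 1, 1, 1, 1]

diluted = dilute_labels(FEATURES, NOISY_LABELS, num_classes=2)
print(hard_labels(diluted))
```

In this toy setting the three nearest neighbors of the mislabeled sample all carry label 0, so its diluted label shifts toward class 0, while correctly labeled samples keep their class. The diluted soft labels would then serve as training targets for the classifier, which corresponds to the final step the abstract mentions.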
Pages: 3181-3201
Number of pages: 21