Trusted-Data-Guided Label Enhancement on Noisy Labels

Cited by: 17
Authors
Xu, Ning [1 ,2 ]
Li, Jia-Yu [1 ,2 ]
Liu, Yun-Peng [1 ,2 ]
Geng, Xin [1 ,2 ]
Affiliations
[1] Southeast Univ, Sch Comp Sci & Engn, Nanjing 211189, Peoples R China
[2] Southeast Univ, Key Lab Comp Network & Informat Integrat, Minist Educ, Nanjing 211189, Peoples R China
Funding
U.S. National Science Foundation; China Postdoctoral Science Foundation;
Keywords
Noise measurement; Training; Probabilistic logic; Labeling; Training data; Task analysis; Supervised learning; Label distribution learning (LDL); label enhancement (LE); noisy labels; trusted data;
DOI
10.1109/TNNLS.2022.3162316
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Label distribution covers a certain number of labels, representing the degree to which each label describes the instance. Label enhancement (LE) is the procedure of recovering label distributions from the logical labels in the training data, with the aim of better depicting label ambiguity through label distributions. However, data annotation inevitably introduces label noise, and implementing LE on corrupted labels is extremely challenging. To deal with this problem, one way to recover the label distribution from corrupted labels is to be guided by a small batch of trusted data. In this article, a novel LE method named TALEN is proposed, which recovers and progressively refines the label distribution under the guidance of trusted data. Specifically, an LE process is applied to the untrusted data to select samples with clean labels. In addition, a combined loss function is designed to train the predictive model for classification. Experiments on datasets with synthetic label noise validate the feasibility of identifying clean labels via the recovered label distribution. Furthermore, experimental results on image datasets with both synthetic and real-world label noise, along with additional experiments on text datasets, show a clear advantage of TALEN over several existing noise-robust learning methods.
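The selection-plus-combined-loss idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual TALEN algorithm: the softmax-based distribution recovery, the confidence threshold, and the weighting factor `alpha` are all simplifying assumptions made here for demonstration.

```python
import numpy as np

def recover_label_distribution(logits):
    # Turn model scores into a label distribution per sample (softmax).
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def select_clean(distributions, noisy_labels, threshold=0.5):
    # Keep untrusted samples whose recovered distribution agrees with the
    # observed (possibly noisy) logical label with sufficient confidence.
    conf = distributions[np.arange(len(noisy_labels)), noisy_labels]
    return conf >= threshold

def combined_loss(dist_trusted, y_trusted, dist_selected, y_selected, alpha=0.5):
    # Cross-entropy on the trusted batch plus a weighted cross-entropy on
    # the untrusted samples selected as clean (alpha is a hypothetical weight).
    eps = 1e-12
    idx_t = np.arange(len(y_trusted))
    ce_t = -np.mean(np.log(dist_trusted[idx_t, y_trusted] + eps))
    if len(y_selected) == 0:
        return ce_t
    idx_s = np.arange(len(y_selected))
    ce_s = -np.mean(np.log(dist_selected[idx_s, y_selected] + eps))
    return ce_t + alpha * ce_s
```

In this sketch, samples whose recovered distribution contradicts their stored label are simply excluded from the untrusted loss term, mirroring the abstract's description of selecting clean samples before training the predictive model.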
Pages: 9940-9951
Page count: 12