Improving label quality in crowdsourcing using deep co-teaching-based noise correction

Cited by: 0
Authors
Kang Zhu
Siqing Xue
Liangxiao Jiang
Affiliations
[1] China University of Geosciences, School of Computer Science
[2] Ministry of Education, Key Laboratory of Artificial Intelligence
Source
International Journal of Machine Learning and Cybernetics | 2023 / Vol. 14
Keywords
Crowdsourcing; Noise correction; Co-teaching; Deep learning
DOI
Not available
Abstract
In the crowdsourcing scenario, repeated labeling is used to collect a multiple noisy label set for each instance from different crowd workers on the Internet, and a ground truth inference method then derives the instance's integrated label. However, whichever ground truth inference method is used, a certain level of noise remains in the integrated labels. To improve the quality of the integrated labels, a number of noise correction methods have been proposed in recent years. To the best of our knowledge, almost all of these methods filter out too many instances as noise instances and can therefore use only a few clean instances to learn classifiers for noise correction. In this paper, we propose a two-stage noise correction method called deep co-teaching-based noise correction (DCTNC), which learns adaptively not only from clean instances but also from noise instances. In the first stage, the original instances are split into a clean set and a noise set according to the confusion level of their multiple noisy label sets. In the second stage, we first train a deep network on the clean set and then use it to guide the training of two other deep networks on the noise set through an improved co-teaching algorithm. Finally, we use the three trained deep networks to correct the instances in the noise set. Experimental results on eleven simulated datasets and one real-world dataset show that the proposed DCTNC achieves new state-of-the-art results.
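The first stage described in the abstract, splitting instances by the confusion level of their multiple noisy label sets, might be sketched roughly as follows. The abstract does not say how the confusion level is computed, so this sketch assumes normalized Shannon entropy of the label distribution as a stand-in; the function names `label_entropy` and `split_by_confusion` and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def label_entropy(labels, num_classes):
    """Normalized Shannon entropy of one instance's noisy labels, in [0, 1].

    ASSUMPTION: entropy is used here as an illustrative confusion measure;
    the paper's actual definition of confusion level may differ.
    """
    if not labels:
        raise ValueError("each instance needs at least one crowd label")
    if num_classes < 2:
        return 0.0
    total = len(labels)
    entropy = 0.0
    for count in Counter(labels).values():
        p = count / total
        entropy += p * math.log(1.0 / p)
    return entropy / math.log(num_classes)


def split_by_confusion(noisy_label_sets, num_classes, threshold=0.5):
    """Return (clean_indices, noise_indices).

    ASSUMPTION: `threshold` is a hypothetical cut-off; instances whose
    entropy is at or below it go to the clean set, the rest to the noise set.
    """
    clean, noise = [], []
    for idx, labels in enumerate(noisy_label_sets):
        if label_entropy(labels, num_classes) <= threshold:
            clean.append(idx)
        else:
            noise.append(idx)
    return clean, noise


# Three instances, each labeled by four workers, binary task.
label_sets = [[0, 0, 0, 0], [0, 1, 0, 1], [1, 1, 1, 0]]
clean_idx, noise_idx = split_by_confusion(label_sets, num_classes=2)
print(clean_idx, noise_idx)
```

With the default threshold, only the instance on which all workers agree lands in the clean set. Raising the threshold moves instances with mild disagreement into the clean set as well, which is the kind of trade-off a fixed cut-off makes.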
Pages: 3641-3654
Number of pages: 13