Improving label quality in crowdsourcing using deep co-teaching-based noise correction

Citations: 0
Authors
Kang Zhu
Siqing Xue
Liangxiao Jiang
Affiliations
[1] China University of Geosciences,School of Computer Science
[2] Ministry of Education,Key Laboratory of Artificial Intelligence
Source
International Journal of Machine Learning and Cybernetics | 2023 / Vol. 14
Keywords
Crowdsourcing; Noise correction; Co-teaching; Deep learning
DOI
Not available
Abstract
In the crowdsourcing scenario, repeated labeling is used to collect a multiple noisy label set for each instance from different crowd workers on the Internet, and a ground truth inference method is then applied to obtain the instance's integrated label. However, no matter which ground truth inference method is used, a certain level of noise remains in the integrated labels. To improve the quality of the integrated labels, a number of noise correction methods have been proposed in recent years. However, to the best of our knowledge, almost all of these methods filter out too many instances as noise instances and can therefore use only a few clean instances to learn classifiers for noise correction. In this paper, we propose a two-stage noise correction method called deep co-teaching-based noise correction (DCTNC), which can learn adaptively not only from clean instances but also from noise instances. In the first stage, the original instances are split into a clean set and a noise set according to the confusion level of their multiple noisy label sets. In the second stage, we first train a deep network on the clean set and then use it to guide the training of two other deep networks on the noise set through an improved co-teaching algorithm. Finally, we use the three trained deep networks to correct the instances in the noise set. Experimental results on eleven simulated datasets and one real-world dataset show that the proposed DCTNC achieves new state-of-the-art results.
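The first stage described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the confusion level is computed, so this sketch assumes the normalized Shannon entropy of each multiple noisy label set, a hypothetical `threshold` parameter, and simple majority voting as the ground truth inference method. The function names are illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def majority_vote(labels):
    """Integrated label by majority voting (assumed inference method).

    Ties are broken by choosing the smallest label value.
    """
    counts = Counter(labels)
    best = max(counts.values())
    return min(label for label, c in counts.items() if c == best)


def label_entropy(labels, num_classes):
    """Normalized Shannon entropy of one multiple noisy label set.

    Assumed stand-in for the paper's "confusion level".
    Requires a non-empty label set and num_classes >= 2.
    Returns a value in [0, 1]; 0 means all workers agree.
    """
    counts = Counter(labels)
    total = len(labels)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(num_classes)


def split_clean_noise(label_sets, num_classes, threshold=0.5):
    """Split instances into a clean set and a noise set.

    Each output entry is (instance index, integrated label).
    `threshold` is a hypothetical cut-off on the normalized entropy.
    """
    clean, noise = [], []
    for idx, labels in enumerate(label_sets):
        confusion = label_entropy(labels, num_classes)
        target = clean if confusion <= threshold else noise
        target.append((idx, majority_vote(labels)))
    return clean, noise


if __name__ == "__main__":
    # Three instances, each labeled by five crowd workers (binary task).
    label_sets = [[1, 1, 1, 1, 1], [0, 1, 0, 1, 1], [0, 0, 0, 0, 1]]
    clean, noise = split_clean_noise(label_sets, num_classes=2, threshold=0.8)
    print("clean:", clean)
    print("noise:", noise)
```

In the method described above, the clean set would then train the first deep network, and the noise set would be passed to the co-teaching stage for correction. That second stage is not sketched here.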
Pages: 3641-3654
Page count: 13