Certainty weighted voting-based noise correction for crowdsourcing

Cited by: 5
Authors
Li, Huiru [1]
Jiang, Liangxiao [1]
Li, Chaoqun [2]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Math & Phys, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Crowdsourcing; Noise correction; Certainty; Class-dependent; Instance-dependent; Weighted voting; MODEL QUALITY; IMPROVING DATA; TOOL;
DOI
10.1016/j.patcog.2024.110325
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In crowdsourcing scenarios, we can obtain a set of multiple noisy labels for each instance from different workers and then use a ground truth inference algorithm to infer its integrated label. Despite the effectiveness of ground truth inference algorithms, a certain level of noise remains in the integrated labels. To reduce the impact of this noise, many noise correction algorithms have been proposed in recent years. To the best of our knowledge, almost all of these algorithms assume that workers have the same labeling certainty across different classes and instances. However, this assumption rarely holds in reality because workers differ in individual preferences and cognitive abilities. In this paper, we argue that the labeling certainty of a worker should be class-dependent and instance-dependent. Based on this premise, we propose a certainty weighted voting-based noise correction (CWVNC) algorithm. First, we use the consistency between worker-assigned labels and integrated labels on different classes to estimate the class-dependent certainty. Then, we train a probability-based classifier on the instances labeled by each worker separately and use it to estimate the instance-dependent certainty. Finally, we correct the integrated label of each instance by weighted voting based on the class-dependent and instance-dependent certainty. In our experiments, the average noise ratio of CWVNC on 34 simulated datasets is 15.08%, and on the two real-world datasets "Income" and "Music_genre" the noise ratio is 25.77% and 26.94%, respectively. The results show that CWVNC significantly outperforms all other state-of-the-art noise correction algorithms used for comparison.
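The three steps summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the estimator in `class_certainty` (share of a worker's labels of a class that agree with the integrated label), the product used to combine the two certainties, and the tie-breaking rule are all assumptions. The instance-dependent certainty is passed in as `instance_cert`, a hypothetical table standing in for the probability that each worker's own classifier assigns to the label that worker gave.

```python
from collections import defaultdict

MISSING = None  # marker: the worker did not label this instance


def class_certainty(worker_labels, integrated, classes):
    """Assumed estimator: for each worker and class c, the share of the
    worker's labels equal to c that agree with the integrated label."""
    certainty = []
    for labels in worker_labels:
        per_class = {}
        for c in classes:
            given = [i for i, lab in enumerate(labels) if lab == c]
            agree = sum(1 for i in given if integrated[i] == c)
            per_class[c] = agree / len(given) if given else 0.0
        certainty.append(per_class)
    return certainty


def correct_labels(worker_labels, integrated, instance_cert, classes):
    """Weighted voting: each worker's vote counts with weight
    class certainty * instance certainty (the product is an assumption)."""
    class_cert = class_certainty(worker_labels, integrated, classes)
    corrected = []
    for i, current in enumerate(integrated):
        votes = defaultdict(float)
        for j, labels in enumerate(worker_labels):
            lab = labels[i]
            if lab is MISSING:
                continue
            votes[lab] += class_cert[j][lab] * instance_cert[j][i]
        if not votes:
            corrected.append(current)  # nobody labeled it: keep the label
            continue
        best = max(votes.values())
        winners = [c for c in classes if votes.get(c, 0.0) == best]
        # On a tie, prefer the existing integrated label (assumed rule).
        corrected.append(current if current in winners else winners[0])
    return corrected


# Toy data: 3 workers, 4 instances, binary classes (all values made up).
classes = [0, 1]
worker_labels = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 0, MISSING],
]
integrated = [0, 1, 1, 1]  # e.g. the result of majority voting
instance_cert = [          # hypothetical per-worker classifier confidence
    [0.2, 0.6, 0.9, 0.8],
    [0.2, 0.9, 0.9, 0.9],
    [0.9, 0.7, 0.4, 0.0],
]
corrected = correct_labels(worker_labels, integrated, instance_cert, classes)
print(corrected)
```

In this toy run, the first instance is the only one whose label changes: the two workers who voted for class 0 have low instance certainty there, so the single confident vote for class 1 outweighs them. In practice, `instance_cert` would come from a probabilistic classifier trained per worker, as the abstract describes.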
Pages: 9