Certainty weighted voting-based noise correction for crowdsourcing

Cited by: 5
Authors
Li, Huiru [1]
Jiang, Liangxiao [1]
Li, Chaoqun [2]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Math & Phys, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Crowdsourcing; Noise correction; Certainty; Class-dependent; Instance-dependent; Weighted voting; MODEL QUALITY; IMPROVING DATA; TOOL;
DOI
10.1016/j.patcog.2024.110325
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In crowdsourcing scenarios, each instance receives a multiple noisy label set from different workers, and a ground truth inference algorithm is then used to infer its integrated label. Despite the effectiveness of ground truth inference algorithms, integrated labels still contain a certain level of noise. To reduce the impact of this noise, many noise correction algorithms have been proposed in recent years. To the best of our knowledge, almost all of these algorithms assume that workers have the same labeling certainty across different classes and instances. However, this is rarely true in reality because workers differ in their individual preferences and cognitive abilities. In this paper, we argue that the labeling certainty of a worker should be class-dependent and instance-dependent. Based on this premise, we propose a certainty weighted voting-based noise correction (CWVNC) algorithm. First, we use the consistency between worker-assigned labels and integrated labels on different classes to estimate the class-dependent certainty. Then, we train a probability-based classifier separately on the instances labeled by each worker and use it to estimate the instance-dependent certainty. Finally, we correct the integrated label of each instance by weighted voting based on the class-dependent and instance-dependent certainty. In our experiments, the average noise ratio of CWVNC on 34 simulated datasets is 15.08%, and on the two real-world datasets "Income" and "Music_genre" the noise ratio is 25.77% and 26.94%, respectively. The results show that CWVNC significantly outperforms all the other state-of-the-art noise correction algorithms used for comparison.
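The three steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the class-dependent certainty is taken as the per-class agreement rate between a worker's labels and the integrated labels, the instance-dependent certainty is supplied as a precomputed number (standing in for the probability that a hypothetical per-worker classifier assigns to the worker's own label), and the two are combined by simple multiplication. The function names `class_certainty` and `correct_labels` and the toy data are likewise hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def class_certainty(worker_labels, integrated):
    """Assumed estimate of class-dependent certainty for one worker.

    For each class c, return the fraction of instances this worker
    labeled c whose integrated label is also c. None = not labeled.
    """
    agree, total = defaultdict(int), defaultdict(int)
    for w, y in zip(worker_labels, integrated):
        if w is None:
            continue
        total[w] += 1
        agree[w] += int(w == y)
    return {c: agree[c] / total[c] for c in total}


def correct_labels(labels, inst_cert, integrated):
    """Weighted voting: each worker's vote for its own label is weighted by
    (class-dependent certainty) * (instance-dependent certainty).
    The product is an assumption; the abstract does not state the rule.
    """
    class_cert = {r: class_certainty(ls, integrated) for r, ls in labels.items()}
    corrected = []
    for i, y in enumerate(integrated):
        votes = defaultdict(float)
        for r, ls in labels.items():
            if ls[i] is None:
                continue
            votes[ls[i]] += class_cert[r][ls[i]] * inst_cert[r][i]
        # Keep the integrated label when no worker labeled the instance.
        corrected.append(max(votes, key=votes.get) if votes else y)
    return corrected


# Toy data (hypothetical): three workers, four instances, two classes.
labels = {
    "w1": [0, 0, 1, 1],
    "w2": [0, 1, 1, None],
    "w3": [1, 1, 1, 0],
}
# Stand-in for per-worker classifier probabilities on the worker's own label.
inst_cert = {
    "w1": [0.9, 0.6, 0.8, 0.9],
    "w2": [0.8, 0.9, 0.9, 0.0],
    "w3": [0.5, 0.7, 0.9, 0.4],
}
integrated = [0, 1, 1, 0]

corrected = correct_labels(labels, inst_cert, integrated)
print(corrected)  # [0, 1, 1, 1]
```

In this toy run the integrated label of the last instance flips from 0 to 1, because worker w1's vote (0.5 × 0.9 = 0.45) outweighs worker w3's (1.0 × 0.4 = 0.40). In a fuller version, the `inst_cert` values would come from a probabilistic classifier trained separately on each worker's labeled instances, as the abstract describes.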
Pages: 9