A Two-stage Iterative Approach to Improve Crowdsourcing-Based Relevance Assessment

Cited by: 0
Authors
Yongzhen Wang
Yan Lin
Zheng Gao
Yan Chen
Affiliations
[1] School of Maritime Economics and Management, Dalian Maritime University
[2] School of Informatics, Computing and Engineering, Indiana University Bloomington
Source
Arabian Journal for Science and Engineering | 2019, Vol. 44
Keywords
Crowdsourcing; Relevance assessment; Ensemble classifier; Expectation maximization; Expert guidance;
DOI: not available
Abstract
Crowdsourcing has emerged as a viable platform for the relevance assessment of information retrieval. However, because crowdsourcing relies on independent and anonymous workers, quality control over assessment results has become a pressing concern in both academia and industry. To address this problem, we propose a two-stage iterative approach that integrates an ensemble classifier with expert guidance. Specifically, in the first stage an ensemble classifier selects unreliable assessment objects for experts to validate; in the second stage, expectation maximization is used to update all assessment results based on the validation feedback. This loop continues until the cost limit is reached. A simulation experiment demonstrates that, compared with existing solutions, our approach eliminates more noise and thereby achieves higher accuracy, while maintaining an acceptable running time and a low labor cost.
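The two-stage loop described in the abstract can be sketched roughly as follows. This is a minimal illustration only: the names (`two_stage_loop`, `expert_oracle`), the weighted-vote aggregation standing in for the ensemble classifier, and the smoothed-accuracy re-weighting standing in for the EM update are all assumptions for exposition, not the paper's actual formulation.

```python
def aggregate(labels_by_object, weight):
    """Weighted majority vote over binary (0/1) worker labels."""
    est = {}
    for obj, votes in labels_by_object.items():
        score = sum(weight[w] * (1 if lab == 1 else -1) for w, lab in votes)
        est[obj] = 1 if score >= 0 else 0
    return est


def two_stage_loop(labels_by_object, expert_oracle, budget, batch=1):
    """Iterate: (1) send the most contested objects to experts for
    validation, (2) re-estimate worker reliabilities and all labels
    (EM-style), until the expert-validation budget is exhausted."""
    workers = {w for votes in labels_by_object.values() for w, _ in votes}
    weight = {w: 1.0 for w in workers}   # initial worker reliabilities
    validated = {}                       # expert-confirmed labels

    while budget > 0:
        # Stage 1: pick unvalidated objects with the smallest vote margin,
        # i.e. those the current ensemble is least confident about.
        def margin(obj):
            return abs(sum(weight[w] * (1 if lab == 1 else -1)
                           for w, lab in labels_by_object[obj]))

        candidates = [o for o in labels_by_object if o not in validated]
        if not candidates:
            break
        candidates.sort(key=margin)
        for obj in candidates[:min(batch, budget)]:
            validated[obj] = expert_oracle(obj)  # costly expert validation
            budget -= 1

        # Stage 2 (EM-style): treat the current estimates plus the expert
        # feedback as provisional truth, then re-weight each worker by a
        # Laplace-smoothed agreement rate with that truth.
        truth = {**aggregate(labels_by_object, weight), **validated}
        for w in workers:
            outcomes = [1 if lab == truth[obj] else 0
                        for obj, votes in labels_by_object.items()
                        for ww, lab in votes if ww == w]
            weight[w] = (sum(outcomes) + 1) / (len(outcomes) + 2)

    # Expert-validated labels override the final aggregated estimates.
    return {**aggregate(labels_by_object, weight), **validated}
```

The key design point the abstract emphasizes is that expert effort is spent only on the objects the ensemble flags as unreliable, and each round of feedback propagates to all remaining objects through the re-estimated worker reliabilities.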
Pages: 3155-3172 (17 pages)