Practical Lessons for Gathering Quality Labels at Scale

Cited by: 9
Author
Alonso, Omar [1]
Affiliation
[1] Microsoft Corp, Redmond, WA 98052 USA
Source
SIGIR 2015: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval | 2015
Keywords
Labeling; crowdsourcing; inter-rater agreement; debugging; Captchas; worker reliability; experimental design
DOI
10.1145/2766462.2776778
CLC number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Information retrieval researchers and engineers use human computation as a mechanism to produce labeled data sets for product development, research, and experimentation. To gather useful results, a successful labeling task relies on many different elements: clear instructions, user interface design, representative high-quality datasets, appropriate inter-rater agreement metrics, work quality checks, and channels for worker feedback. Furthermore, designing and implementing tasks that produce and use thousands or millions of labels differs from conducting small-scale research investigations. In this paper, we present a perspective for collecting high-quality labels with an emphasis on practical problems and scalability. We focus on three main topics: programming crowds, debugging tasks with low agreement, and algorithms for quality control. We show examples from an industrial setting.
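The abstract highlights inter-rater agreement metrics as one element of a successful labeling task. As an illustration only (the paper discusses agreement metrics in general; the choice of pairwise Cohen's kappa and the toy worker data below are assumptions, not the paper's method), a minimal sketch in Python:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators labeling the same items.

    labels_a, labels_b: equal-length sequences of categorical labels.
    """
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0

# Hypothetical example: two crowd workers judging five query-document pairs.
worker_1 = ["relevant", "relevant", "not_relevant", "relevant", "not_relevant"]
worker_2 = ["relevant", "not_relevant", "not_relevant", "relevant", "not_relevant"]
print(cohens_kappa(worker_1, worker_2))  # about 0.62 for this toy data
```

Low kappa values of this kind are what the paper's "debugging tasks with low agreement" topic addresses; the metric itself says nothing about whether instructions or data quality caused the disagreement.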
Pages: 1089-1092
Page count: 4