Crowdsourcing Database Systems: Overview and Challenges

Cited by: 25
Authors
Chai, Chengliang [1]
Fan, Ju [2]
Li, Guoliang [1]
Wang, Jiannan [3]
Zheng, Yudian [4]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci, Beijing, Peoples R China
[2] Renmin Univ, Dept Comp Sci, Beijing, Peoples R China
[3] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC, Canada
[4] Twitter, San Francisco, CA USA
Source
2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019) | 2019
Keywords
MANAGEMENT;
DOI
10.1109/ICDE.2019.00237
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Many data management and analytics tasks, such as entity resolution, cannot be solely addressed by automated processes. Crowdsourcing is an effective way to harness human cognitive ability to process these computer-hard tasks. Thanks to public crowdsourcing platforms, e.g., Amazon Mechanical Turk and CrowdFlower, we can easily involve hundreds of thousands of ordinary workers (i.e., the crowd) to address these computer-hard tasks. However, it is rather inconvenient to interact with the crowdsourcing platforms, because the platforms require one to set parameters and even write code. Inspired by traditional DBMSs, crowdsourcing database systems have been proposed and widely studied to encapsulate the complexities of interacting with the crowd. In this tutorial, we will survey and synthesize a wide spectrum of existing studies on crowdsourcing database systems. We first give an overview of crowdsourcing, and then summarize the fundamental techniques in designing crowdsourcing databases, including task design, truth inference, task assignment, answer reasoning, and latency reduction. Next, we review the techniques for designing crowdsourced operators, including selection, join, sort, top-k, max/min, count, collect, and fill. Finally, we discuss the emerging challenges.
Pages: 2052-2055
Number of pages: 4