Reputation as a sufficient condition for data quality on Amazon Mechanical Turk

Cited by: 1288
Authors
Peer, Eyal [1]
Vosgerau, Joachim [2]
Acquisti, Alessandro [3]
Affiliations
[1] Bar Ilan Univ, Grad Sch Business Adm, IL-52900 Ramat Gan, Israel
[2] Tilburg Univ, Sch Econ & Management, NL-5000 LE Tilburg, Netherlands
[3] Carnegie Mellon Univ, Heinz Coll, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (United States);
Keywords
Online research; Amazon Mechanical Turk; Data quality; Reputation; SCALE;
DOI
10.3758/s13428-013-0434-y
Chinese Library Classification number
B841 [Research methods in psychology];
Discipline classification code
040201;
Abstract
Data quality is one of the major concerns of using crowdsourcing websites such as Amazon Mechanical Turk (MTurk) to recruit participants for online behavioral studies. We compared two methods for ensuring data quality on MTurk: attention check questions (ACQs) and restricting participation to MTurk workers with high reputation (above 95% approval ratings). In Experiment 1, we found that high-reputation workers rarely failed ACQs and provided higher-quality data than did low-reputation workers; ACQs improved data quality only for low-reputation workers, and only in some cases. Experiment 2 corroborated these findings and also showed that more productive high-reputation workers produce the highest-quality data. We concluded that sampling high-reputation workers can ensure high-quality data without having to resort to using ACQs, which may lead to selection bias if participants who fail ACQs are excluded post-hoc.
Pages: 1023-1031
Page count: 9