Sample Size Planning for Replication Studies: The Devil Is in the Design

Cited by: 20
Authors
Anderson, Samantha F. [1]
Kelley, Ken [2]
Affiliations
[1] Arizona State Univ, Dept Psychol, 950 South McAllister Ave, Tempe, AZ 85281 USA
[2] Univ Notre Dame, Dept IT Analyt & Operat, Mendoza Coll Business, Notre Dame, IN 46556 USA
Keywords
replication; sample size; statistical power; accuracy in parameter estimation; meta-analysis; METAANALYTIC INTERVAL ESTIMATION; STANDARDIZED MEAN DIFFERENCE; STATISTICAL POWER; PARAMETER-ESTIMATION; NULL-HYPOTHESIS; CONCEPTUAL REPLICATIONS; PSYCHOLOGICAL-RESEARCH; PUBLICATION BIAS; P-VALUES; ACCURACY
DOI
10.1037/met0000520
Chinese Library Classification number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Translational Abstract: Replication is fundamental to science and has been receiving increased attention in psychology and related fields. Whether a replication study is deemed a "success" or a "failure" has typically been deduced from the result of a null hypothesis significance test, wherein results reaching conventional criteria for statistical significance are considered successful replications and nonsignificant results are deemed failures. Recently, scientists have been encouraged to consider other approaches, consistent with alternative goals for the replication study. However, study design, and specifically sample size planning, has often been absent from the discussion. Sample size planning should be consistent with the particular replication goal and analysis method that will be used. In the present article, we articulate four different goals for replication and we present formal sample size planning guidance for each goal. We include empirical examples and computer syntax to demonstrate each procedure in practice.
Replication is central to scientific progress. Because of widely reported replication failures, replication has received increased attention in psychology, sociology, education, management, and related fields in recent years. Replication studies have generally been assessed dichotomously, designated either a "success" or "failure" based entirely on the outcome of a null hypothesis significance test (i.e., p < .05 or p > .05, respectively). However, alternative definitions of success depend on researchers' goals for the replication. Previous work on alternative definitions for success has focused on the analysis phase of replication. However, the design of the replication is also important, as emphasized with the adage, "an ounce of prevention is better than a pound of cure." One critical component of design often ignored or oversimplified in replication studies is sample size planning; indeed, the details here are crucial. Sample size planning for replication studies should correspond to the method by which success will be evaluated. Researchers have received little guidance, some of which is misguided, on sample size planning for replication goals other than the aforementioned dichotomous null hypothesis significance testing approach. In this article, we describe four different replication goals. Then, we formalize sample size planning methods for each of the four goals. This article aims to provide clarity on the procedures for sample size planning for each goal, with examples and syntax provided to show how each procedure can be used in practice.
Pages: 844-867
Page count: 25