Sample Size Planning for Replication Studies: The Devil Is in the Design

Cited by: 20
Authors
Anderson, Samantha F. [1]
Kelley, Ken [2]
Affiliations
[1] Arizona State Univ, Dept Psychol, 950 South McAllister Ave, Tempe, AZ 85281 USA
[2] Univ Notre Dame, Dept IT Analyt & Operat, Mendoza Coll Business, Notre Dame, IN 46556 USA
Keywords
replication; sample size; statistical power; accuracy in parameter estimation; meta-analysis; METAANALYTIC INTERVAL ESTIMATION; STANDARDIZED MEAN DIFFERENCE; STATISTICAL POWER; PARAMETER-ESTIMATION; NULL-HYPOTHESIS; CONCEPTUAL REPLICATIONS; PSYCHOLOGICAL-RESEARCH; PUBLICATION BIAS; P-VALUES; ACCURACY;
DOI
10.1037/met0000520
Chinese Library Classification (CLC) number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Translational Abstract: Replication is fundamental to science and has been receiving increased attention in psychology and related fields. Whether a replication study is deemed a "success" or a "failure" has typically been deduced from the result of a null hypothesis significance test, wherein results reaching conventional criteria for statistical significance are considered successful replications and nonsignificant results are deemed failures. Recently, scientists have been encouraged to consider other approaches, consistent with alternative goals for the replication study. However, study design, and specifically sample size planning, has often been absent from the discussion. Sample size planning should be consistent with the particular replication goal and analysis method that will be used. In the present article, we articulate four different goals for replication and we present formal sample size planning guidance for each goal. We include empirical examples and computer syntax to demonstrate each procedure in practice.
Abstract: Replication is central to scientific progress. Because of widely reported replication failures, replication has received increased attention in psychology, sociology, education, management, and related fields in recent years. Replication studies have generally been assessed dichotomously, designated either a "success" or "failure" based entirely on the outcome of a null hypothesis significance test (i.e., p < .05 or p > .05, respectively). However, alternative definitions of success depend on researchers' goals for the replication. Previous work on alternative definitions for success has focused on the analysis phase of replication. However, the design of the replication is also important, as emphasized with the adage, "an ounce of prevention is better than a pound of cure." One critical component of design often ignored or oversimplified in replication studies is sample size planning; indeed, the details here are crucial. Sample size planning for replication studies should correspond to the method by which success will be evaluated. Researchers have received little guidance, some of which is misguided, on sample size planning for replication goals other than the aforementioned dichotomous null hypothesis significance testing approach. In this article, we describe four different replication goals. Then, we formalize sample size planning methods for each of the four goals. This article aims to provide clarity on the procedures for sample size planning for each goal, with examples and syntax provided to show how each procedure can be used in practice.
Pages: 844-867
Page count: 25