Recommendations for utilizing and reporting population genetic analyses: the reproducibility of genetic clustering using the program STRUCTURE

Cited by: 226
Authors
Gilbert, Kimberly J. [1, 2]
Andrew, Rose L. [1, 3]
Bock, Dan G. [1, 3]
Franklin, Michelle T. [1, 4]
Kane, Nolan C. [1, 3, 5]
Moore, Jean-Sebastien [1, 2]
Moyers, Brook T. [1, 3]
Renaut, Sebastien [1, 3]
Rennison, Diana J. [1, 2]
Veen, Thor [1]
Vines, Timothy H. [1, 6]
Affiliations
[1] Univ British Columbia, Biodivers Res Ctr, Vancouver, BC V6T 1Z4, Canada
[2] Univ British Columbia, Dept Zool, Vancouver, BC V6T 1Z4, Canada
[3] Univ British Columbia, Dept Bot, Vancouver, BC V6T 1Z4, Canada
[4] Simon Fraser Univ, Dept Biol Sci, Burnaby, BC V5A 1S6, Canada
[5] Univ Colorado, Dept Ecol & Evolutionary Biol, Boulder, CO 80309 USA
[6] Mol Ecol Editorial Off, Vancouver, BC V6T 1Z4, Canada
Keywords
population clustering; population genetics; reproducibility; STRUCTURE; MULTILOCUS GENOTYPE DATA; INFERENCE; IDENTIFICATION; SOFTWARE
DOI
10.1111/j.1365-294X.2012.05754.x
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Reproducibility is the benchmark for results and conclusions drawn from scientific studies, but systematic studies on the reproducibility of scientific results are surprisingly rare. Moreover, many modern statistical methods make use of 'random walk' model fitting procedures, and these are inherently stochastic in their output. Does the combination of these statistical procedures and current standards of data archiving and method reporting permit the reproduction of the authors' results? To test this, we reanalysed data sets gathered from papers using the software package STRUCTURE to identify genetically similar clusters of individuals. We find that reproducing STRUCTURE results can be difficult despite the straightforward requirements of the program. Our results indicate that 30% of analyses were unable to reproduce the same number of population clusters. To improve this, we make recommendations for future use of the software and for reporting STRUCTURE analyses and results in published works.
Pages: 4925-4930
Page count: 6