Species Delimitation using Genome-Wide SNP Data

Citations: 346
Authors
Leache, Adam D. [1 ,2 ]
Fujita, Matthew K. [3 ]
Minin, Vladimir N. [1 ,4 ]
Bouckaert, Remco R. [5 ]
Affiliations
[1] Univ Washington, Dept Biol, Seattle, WA 98195 USA
[2] Univ Washington, Burke Museum Nat Hist & Culture, Seattle, WA 98195 USA
[3] Univ Texas Arlington, Dept Biol, Arlington, TX 76019 USA
[4] Univ Washington, Dept Stat, Seattle, WA 98195 USA
[5] Univ Auckland, Computat Evolut Grp, Auckland 1, New Zealand
Funding
U.S. National Institutes of Health; U.S. National Science Foundation;
Keywords
ULTRACONSERVED ELEMENTS; GENE TREES; INFERENCE; PHYLOGENY; THOUSANDS; ACCURACY;
DOI
10.1093/sysbio/syu018
Chinese Library Classification number
Q [Biological Sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
The multispecies coalescent has provided important progress for evolutionary inferences, including increasing the statistical rigor and objectivity of comparisons among competing species delimitation models. However, Bayesian species delimitation methods typically require brute force integration over gene trees via Markov chain Monte Carlo (MCMC), which introduces a large computational burden and precludes their application to genomic-scale data. Here we combine a recently introduced dynamic programming algorithm for estimating species trees that bypasses MCMC integration over gene trees with sophisticated methods for estimating marginal likelihoods, needed for Bayesian model selection, to provide a rigorous and computationally tractable technique for genome-wide species delimitation. We provide a critical yet simple correction that brings the likelihoods of different species trees, and more importantly their corresponding marginal likelihoods, to the same common denominator, which enables direct and accurate comparisons of competing species delimitation models using Bayes factors. We test this approach, which we call Bayes factor delimitation (*with genomic data; BFD*), using common species delimitation scenarios with computer simulations. Varying the number of loci and the number of samples suggests that the approach can distinguish the true model even with few loci and limited samples per species. Misspecification of the prior for population size theta has little impact on support for the true model. We apply the approach to West African forest geckos (Hemidactylus fasciatus complex) using genome-wide SNP data. This new Bayesian method for species delimitation builds on a growing trend for objective species delimitation methods with explicit model assumptions that are easily tested. [Bayes factor; model testing; phylogeography; RADseq; simulation; speciation.].
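The model comparison step described in the abstract can be illustrated with a minimal sketch. It assumes each competing delimitation model already has an estimated marginal log-likelihood (obtained elsewhere, for example by path sampling), and it reports support on the commonly used 2 ln BF scale. The function names, model names, and numeric values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def two_ln_bf(log_ml_a: float, log_ml_b: float) -> float:
    """Return 2 ln(Bayes factor) for model A over model B.

    Inputs are marginal log-likelihoods, so the log Bayes factor
    is simply their difference.
    """
    return 2.0 * (log_ml_a - log_ml_b)


def rank_models(log_mls: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank models from best to worst by marginal log-likelihood.

    Each entry pairs a model name with 2 ln BF of the best model
    over that model (0.0 for the best model itself).
    """
    best = max(log_mls.values())
    ranked = sorted(log_mls.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, two_ln_bf(best, log_ml)) for name, log_ml in ranked]


# Hypothetical marginal log-likelihoods for three delimitation models.
example = {
    "one_species": -1250.0,
    "two_species": -1200.0,
    "three_species": -1210.0,
}

ranking = rank_models(example)
for name, support in ranking:
    print(f"{name}: 2lnBF(best vs. this model) = {support:.1f}")
```

Larger 2 ln BF values indicate stronger support for the best model over the listed alternative; the comparison is only meaningful when the marginal likelihoods of all models are on the same scale, which is the role of the correction the abstract mentions.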
Pages: 534-542
Number of pages: 9
Related papers
50 in total
  • [1] A New Assessment of Robust Capuchin Monkey (Sapajus) Evolutionary History Using Genome-Wide SNP Marker Data and a Bayesian Approach to Species Delimitation
    Martins, Amely Branquinho
    Valenca-Montenegro, Monica Mafra
    Lima, Marcela Guimaraes Moreira
    Lynch, Jessica W.
    Svoboda, Walfrido Kuhl
    de Sousa e Silva-Junior, Jose
    Rohe, Fabio
    Boubli, Jean Philippe
    Di Fiore, Anthony
    GENES, 2023, 14 (05)
  • [2] Clustering by genetic ancestry using genome-wide SNP data
    Solovieff, Nadia
    Hartley, Stephen W.
    Baldwin, Clinton T.
    Perls, Thomas T.
    Steinberg, Martin H.
    Sebastiani, Paola
    BMC GENETICS, 2010, 11
  • [4] Simultaneous analysis of genome-wide SNP data
    Hoggart, C. J.
    De Iorio, M.
    Whittaker, J. C.
    Balding, D. J.
    GENETIC EPIDEMIOLOGY, 2007, 31 (06) : 609 - 609
  • [5] Genome-wide SNP Data Reveal an Overestimation of Species Diversity in a Group of Hawkmoths
    Hundsdoerfer, Anna K.
    Lee, Kyung Min
    Kitching, Ian J.
    Mutanen, Marko
    GENOME BIOLOGY AND EVOLUTION, 2019, 11 (08) : 2136 - 2150
  • [7] Detection of selective sweeps in cattle using genome-wide SNP data
    Ramey, Holly R.
    Decker, Jared E.
    McKay, Stephanie D.
    Rolf, Megan M.
    Schnabel, Robert D.
    Taylor, Jeremy F.
    BMC GENOMICS, 2013, 14
  • [8] Phylogeography and species delimitation of Cherax destructor (Decapoda: Parastacidae) using genome-wide SNPs
    Unmack, P. J.
    Young, M. J.
    Gruber, B.
    White, D.
    Kilian, A.
    Zhang, X.
    Georges, A.
    MARINE AND FRESHWATER RESEARCH, 2019, 70 (06) : 857 - 869
  • [9] SNP Selection and Classification of Genome-Wide SNP Data Using Stratified Sampling Random Forests
    Wu, Qingyao
    Ye, Yunming
    Liu, Yang
    Ng, Michael K.
    IEEE TRANSACTIONS ON NANOBIOSCIENCE, 2012, 11 (03) : 216 - 227
  • [10] Genome-wide significance for dense SNP and resequencing data
    Hoggart, Clive J.
    Clark, Taane G.
    De Iorio, Maria
    Whittaker, John C.
    Balding, David J.
    GENETIC EPIDEMIOLOGY, 2008, 32 (02) : 179 - 185