Robust Demographic Inference from Genomic and SNP Data

Times cited: 985
Authors
Excoffier, Laurent [1, 2]
Dupanloup, Isabelle [1, 2]
Huerta-Sanchez, Emilia [3]
Sousa, Vitor C. [1, 2]
Foll, Matthieu [1, 2, 4]
Affiliations
[1] CMPG, Inst Ecol & Evolut, Bern, Switzerland
[2] Swiss Inst Bioinformat, Lausanne, Switzerland
[3] Univ Calif Berkeley, Ctr Theoret Evolutionary Genom, Dept Integrat Biol, Berkeley, CA 94720 USA
[4] Ecole Polytech Fed Lausanne, Sch Life Sci, Lausanne, Switzerland
Source
PLOS GENETICS | 2013 / Volume 9 / Issue 10
Funding
Swiss National Science Foundation;
Keywords
APPROXIMATE BAYESIAN COMPUTATION; MAXIMUM-LIKELIHOOD-ESTIMATION; ALLELE FREQUENCY-SPECTRUM; CHAIN MONTE-CARLO; UNSAMPLED POPULATIONS; REVEALS ADAPTATION; MIGRATION RATES; MODEL SELECTION; PARAMETERS; HISTORY;
DOI
10.1371/journal.pgen.1003905
Chinese Library Classification
Q3 [Genetics];
Subject classification codes
071007; 090102;
Abstract
We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.
Pages: 17