Inference of Population Structure from Time-Series Genotype Data

Cited by: 19
Authors
Joseph, Tyler A. [1]
Pe'er, Itsik [1, 2, 3]
Affiliations
[1] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
[2] Columbia Univ, Dept Syst Biol, New York, NY 10027 USA
[3] Columbia Univ, Data Sci Inst, New York, NY 10027 USA
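Funding
National Science Foundation (USA)
Keywords
VARIATIONAL INFERENCE; EARLY FARMERS; ADMIXTURE; DIVERSITY; ANCESTRY; SEQUENCE; PATTERNS; ORIGIN; TOOL; SNP
DOI
10.1016/j.ajhg.2019.06.002
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract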
Sequencing ancient DNA can offer direct probing of population history. Yet, such data are commonly analyzed with standard tools that assume DNA samples are all contemporary. We present DyStruct, a model and inference algorithm for inferring shared ancestry from temporally sampled genotype data. DyStruct explicitly incorporates temporal dynamics by modeling individuals as mixtures of unobserved populations whose allele frequencies drift over time. We develop an efficient inference algorithm for our model using stochastic variational inference. On simulated data, we show that DyStruct outperforms the current state of the art when individuals are sampled over time. Using a dataset of 296 modern and 80 ancient samples, we demonstrate DyStruct is able to capture a well-supported admixture event of steppe ancestry into modern Europe. We further apply DyStruct to a genome-wide dataset of 2,067 modern and 262 ancient samples used to study the origin of farming in the Near East. We show that DyStruct provides new insight into population history when compared with alternate approaches, within feasible run time.
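The generative idea in the abstract (individuals as mixtures of unobserved populations whose allele frequencies drift over time) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the DyStruct implementation: the normal approximation to drift, the symmetric Dirichlet prior on ancestry proportions, the single-SNP setup, and every function name and parameter value below are illustrative choices that do not come from this record.

```python
import random


def drift_step(freq, pop_size, rng):
    """Advance one generation of drift (assumed normal approximation)."""
    sd = (freq * (1.0 - freq) / (2.0 * pop_size)) ** 0.5
    # Clip so the frequency stays a valid probability.
    return min(1.0, max(0.0, rng.gauss(freq, sd)))


def simulate_frequencies(n_pops, n_generations, pop_size, rng):
    """Return freqs[t][k]: allele frequency of population k at generation t."""
    freqs = [[rng.uniform(0.05, 0.95) for _ in range(n_pops)]]
    for _ in range(n_generations):
        freqs.append([drift_step(f, pop_size, rng) for f in freqs[-1]])
    return freqs


def sample_admixture(n_pops, rng, alpha=1.0):
    """Draw ancestry proportions from a symmetric Dirichlet via gamma variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_pops)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_genotype(proportions, pop_freqs, rng):
    """Diploid genotype (0, 1, or 2 allele copies) at one SNP."""
    # The individual's allele probability is the ancestry-weighted mixture
    # of the population frequencies at that individual's sampling time.
    p = sum(q * f for q, f in zip(proportions, pop_freqs))
    return sum(1 for _ in range(2) if rng.random() < p)


rng = random.Random(0)
freqs = simulate_frequencies(n_pops=3, n_generations=50, pop_size=1000, rng=rng)

# Hypothetical sampling times: one ancient, one intermediate, one modern sample.
sample_times = [0, 25, 50]
samples = []
for t in sample_times:
    theta = sample_admixture(3, rng)
    samples.append((t, theta, sample_genotype(theta, freqs[t], rng)))
```

Each entry of `samples` holds a sampling time, the individual's ancestry proportions, and a genotype drawn from the population frequencies as they stood at that time. Tools that assume all samples are contemporary would instead draw every genotype from a single shared set of frequencies.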
Pages: 317-333
Page count: 17