Robust and scalable inference of population history from hundreds of unphased whole genomes

Cited by: 539
Authors
Terhorst, Jonathan [1]
Kamm, John A. [1, 2]
Song, Yun S. [1, 2, 3, 4, 5]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Div Comp Sci, Berkeley, CA 94720 USA
[3] Univ Calif Berkeley, Dept Integrat Biol, Berkeley, CA 94720 USA
[4] Univ Penn, Dept Biol, Philadelphia, PA 19104 USA
[5] Univ Penn, Dept Math, Philadelphia, PA 19104 USA
Keywords
CONDITIONAL SAMPLING DISTRIBUTION; HUMAN-EVOLUTION; CLIMATE-CHANGE; RECOMBINATION; NEANDERTHAL; SEQUENCES; INTROGRESSION; DIVERGENCE; COALESCENT; HUMANS;
DOI
10.1038/ng.3748
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
It has recently been demonstrated that inference methods based on genealogical processes with recombination can uncover past population history in unprecedented detail. However, these methods scale poorly with sample size, limiting resolution in the recent past, and they require phased genomes, which contain switch errors that can catastrophically distort the inferred history. Here we present SMC++, a new statistical tool capable of analyzing orders of magnitude more samples than existing methods while requiring only unphased genomes (its results are independent of phasing). SMC++ can jointly infer population size histories and split times in diverged populations, and it employs a novel spline regularization scheme that greatly reduces estimation error. We apply SMC++ to analyze sequence data from over a thousand human genomes in Africa and Eurasia, hundreds of genomes from a Drosophila melanogaster population in Africa, and tens of genomes from zebra finch and long-tailed finch populations in Australia.
Pages: 303-309
Page count: 7