Modeling of African population history using f-statistics is biased when applying all previously proposed SNP ascertainment schemes

Cited by: 4
Authors
Flegontov, Pavel [1, 2, 3]
Isildak, Ulas [2, 8]
Maier, Robert [1]
Yuncu, Eren [2, 7]
Changmai, Piya [2]
Reich, David [1, 4, 5, 6]
Affiliations
[1] Harvard Univ, Dept Human Evolutionary Biol, Cambridge, MA 02138 USA
[2] Univ Ostrava, Dept Biol & Ecol, Fac Sci, Ostrava, Czech Republic
[3] Russian Acad Sci, Kalmyk Res Ctr, Elista, Russia
[4] Harvard Med Sch, Dept Genet, Boston, MA 02115 USA
[5] Harvard Med Sch, Howard Hughes Med Inst, Boston, MA 02115 USA
[6] Broad Inst Harvard & MIT, Cambridge, MA 02142 USA
[7] Middle East Tech U, Dept Biol Sci, Ankara, Turkiye
[8] FLI, Leibniz Inst Aging, Jena, Germany
Source
PLOS GENETICS | 2023 / Vol. 19 / Issue 09
Keywords
COVERAGE NEANDERTHAL GENOME; WIDE PATTERNS; ADMIXTURE; SEQUENCE; INFERENCE; ANCESTRY; CAVE
DOI
10.1371/journal.pgen.1010931
CLC number
Q3 [Genetics];
Subject classification code
071007; 090102
Abstract
f-statistics have emerged as a first line of analysis for making inferences about demographic history from genome-wide data. Not only are they guaranteed to allow robust tests of the fits of proposed models of population history to data when analyzing full genome sequencing data, that is, all single nucleotide polymorphisms (SNPs) in the individuals being analyzed, but they are also guaranteed to allow robust tests of models for SNPs ascertained as polymorphic in a population that is an outgroup in a phylogenetic sense to all groups being analyzed. True "outgroup ascertainment" is in practice impossible in humans because our species has arisen from a substructured ancestral population that does not descend from a homogeneous ancestral population going back many hundreds of thousands of years into the past. However, initial studies suggested that non-outgroup-ascertainment schemes might produce robust enough results using f-statistics, and that motivated widespread fitting of models to data using non-outgroup-ascertained SNP panels such as the "Affymetrix Human Origins array", which has been genotyped on thousands of modern individuals from hundreds of populations, or the "1240k" in-solution enrichment reagent, which has been the source of about 70% of published genome-wide data for ancient humans. In this study, we show that while analyses of population history using such panels work well for studies of relationships among non-African populations and one African outgroup, when co-modeling more than one sub-Saharan African and/or archaic human groups (Neanderthals and Denisovans), fitting of f-statistics to such SNP sets is expected to frequently lead to false rejection of true demographic histories, and failure to reject incorrect models. Analyzing panels of SNPs polymorphic in archaic humans, which has been suggested as a solution for the ascertainment problem, has limited statistical power and retains important biases. However, by carrying out simulations of diverse demographic histories, we show that bias in inferences based on f-statistics can be minimized by ascertaining on variants common in a union of diverse African groups; such ascertainment retains high statistical power while allowing co-analysis of archaic and modern groups.
Pages: 44