The Impact of Purifying and Background Selection on the Inference of Population History: Problems and Prospects

Cited by: 67
Authors
Johri, Parul [1]
Riall, Kellen [1]
Becher, Hannes [2]
Excoffier, Laurent [3, 4]
Charlesworth, Brian [2]
Jensen, Jeffrey D. [1]
Affiliations
[1] Arizona State Univ, Sch Life Sci, Tempe, AZ 85287 USA
[2] Univ Edinburgh, Sch Biol Sci, Inst Evolutionary Biol, Edinburgh, Midlothian, Scotland
[3] Univ Bern, Inst Ecol & Evolut, Bern, Switzerland
[4] Swiss Inst Bioinformat, Lausanne, Switzerland
Funding
US National Science Foundation; US National Institutes of Health
Keywords
demographic inference; background selection; distribution of fitness effects; MSMC; fastsimcoal2; approximate Bayesian computation (ABC); deleterious mutations; molecular evolution; conserved elements; recombination rate; genome; patterns; size; DNA; sequence
DOI
10.1093/molbev/msab050
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Current procedures for inferring population history generally assume complete neutrality; that is, they neglect both direct selection and the effects of selection on linked sites. Here we examine how the presence of direct purifying selection and background selection may bias demographic inference by evaluating two commonly used methods (MSMC and fastsimcoal2), specifically studying how the underlying shape of the distribution of fitness effects and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increase, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the distribution of fitness effects as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.
Pages: 2986-3003
Page count: 18