On the effects of selection and mutation on species tree inference

Cited by: 2
Authors
Wascher, Matthew [1, 2]
Kubatko, Laura S. [3, 4]
Affiliations
[1] Univ Dayton, Dept Math, Dayton, OH 45469 USA
[2] Ohio State Univ, Coll Publ Hlth, Div Epidemiol, Columbus, OH 43210 USA
[3] Ohio State Univ, Dept Stat, Columbus, OH USA
[4] Ohio State Univ, Dept Evolut Ecol & Organismal Biol, Columbus, OH USA
Keywords
Phylogenetic inference; Phylogenetic tree; Selection; Mutation; Species tree inference; MAXIMUM-LIKELIHOOD; COALESCENT; GENEALOGIES; RECOMBINATION; HITCHHIKING; MODEL;
DOI
10.1016/j.ympev.2022.107650
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704
Abstract
The effect of selection acting on regions of the genome on the accuracy of species-level phylogenetic inference using methods that do not explicitly model selection is an open question that is relevant to most, if not all, phylogenomic studies. To address this, we derive a mathematical approximation to the Wright-Fisher model with mutation and selection in the limit as the population size becomes large. In contrast to previous approximations based on diffusion processes, our approximation can be used to study the distribution of coalescent times for an arbitrary number of lineages, allowing calculation of the probability distribution of gene genealogies under the coalescent model. We use these calculations to show that direct selection at strengths typically encountered in practice has only a small effect on the distribution of coalescent times, and hence on the distribution of gene trees. This implies that many coalescent-based methods for estimating the species tree topology will be robust to the presence of selection in a subset of the underlying genes. Selection will, however, bias the estimation of speciation times, causing them to underestimate the true speciation times. Our model captures the effects of selection on the genealogies that generate the observed sequence data, but does not model selective pressures that act only on the subsequent sequences or that negatively impact gene tree estimation.
Pages: 21
Related papers
49 in total
  • [1] Assessing the Impacts of Positive Selection on Coalescent-Based Species Tree Estimation and Species Delimitation
    Adams, Richard H.
    Schield, Drew R.
    Card, Daren C.
    Castoe, Todd A.
    [J]. SYSTEMATIC BIOLOGY, 2018, 67 (06) : 1076 - 1090
  • [2] The effect of hitch-hiking on neutral genealogies
    Barton, NH
    [J]. GENETICS RESEARCH, 1998, 72 (02) : 123 - 133
  • [3] Coalescence in a random background
    Barton, NH
    Etheridge, AM
    Sturm, AK
    [J]. ANNALS OF APPLIED PROBABILITY, 2004, 14 (02) : 754 - 785
  • [4] The effect of selection on genealogies
    Barton, NH
    Etheridge, AM
    [J]. GENETICS, 2004, 166 (02) : 1115 - 1131
  • [5] Barton Nick., 2019, Mathematical Models in Population Genetics, P115
  • [6] Consistency and identifiability of the polymorphism-aware phylogenetic models
    Borges, Rui
    Kosiol, Carolin
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 2020, 486
  • [7] Evidence for an ancient adaptive episode of convergent molecular evolution
    Castoe, Todd A.
    de Koning, A. P. Jason
    Kim, Hyun-Min
    Gu, Wanjun
    Noonan, Brice P.
    Naylor, Gavin
    Jiang, Zhi J.
    Parkinson, Christopher L.
    Pollock, David D.
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2009, 106 (22) : 8986 - 8991
  • [8] Quartet Inference from SNP Data Under the Coalescent Model
    Chifman, Julia
    Kubatko, Laura
    [J]. BIOINFORMATICS, 2014, 30 (23) : 3317 - 3324
  • [9] Natural Selection Constrains Neutral Diversity across A Wide Range of Species
    Corbett-Detig, Russell B.
    Hartl, Daniel L.
    Sackton, Timothy B.
    [J]. PLOS BIOLOGY, 2015, 13 (04)
  • [10] PoMo: An Allele Frequency-Based Approach for Species Tree Estimation
    De Maio, Nicola
    Schrempf, Dominik
    Kosiol, Carolin
    [J]. SYSTEMATIC BIOLOGY, 2015, 64 (06) : 1018 - 1031