Efficient inference, potential, and limitations of site-specific substitution models

Cited by: 2
Authors
Puller, Vadim [1, 2]
Sagulenko, Pavel [3]
Neher, Richard A. [1, 2]
Affiliations
[1] Univ Basel, Biozentrum, Klingelbergstr 50-70, CH-4056 Basel, Switzerland
[2] SIB Swiss Inst Bioinformat, Klingelbergstr 61, Basel, Switzerland
[3] Max Planck Inst Dev Biol, Max Planck Ring 5, D-72076 Tubingen, Germany
Funding
European Research Council
Keywords
phylogenetics; fitness landscapes; algorithms; PHYLOGENETIC ANALYSIS; MIXTURE MODEL; SEQUENCES; SELECTION; MUTATION; EVOLUTION;
DOI
10.1093/ve/veaa066
Chinese Library Classification number
Q93 [Microbiology]
Discipline classification codes
071005; 100705
Abstract
Natural selection imposes a complex filter on which variants persist in a population, resulting in evolutionary patterns that vary greatly along the genome. Some sites evolve close to neutrally, while others are highly conserved, allow only specific states, or only change in concert with other sites. On the one hand, such constraints on sequence evolution can be used to infer biological function; on the other hand, they need to be accounted for in phylogenetic reconstruction. Phylogenetic models often account for this complexity by partitioning sites into a small number of discrete classes with different rates and/or state preferences. Appropriate model complexity is typically determined by model selection procedures. Here, we present an efficient algorithm to estimate more complex models that allow for different preferences at every site and explore the accuracy with which such models can be estimated from simulated data. Our iterative approximate maximum-likelihood scheme uses information in the data efficiently and accurately estimates site-specific preferences from large data sets with moderately diverged sequences and known topology. However, the joint estimation of site-specific rates, site-specific preferences, and phylogenetic branch lengths can suffer from identifiability problems, while ignoring variation in preferences across sites results in branch length underestimates. Site-specific preferences estimated from large HIV pol alignments show qualitative concordance with intra-host estimates of fitness costs. Analysis of these substitution models suggests near saturation of divergence after a few hundred years. Such saturation can explain the inability to infer deep divergence times of HIV and SIVs using molecular clock approaches and time-dependent rate estimates.
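The link between site-specific preferences, saturation, and branch length underestimates can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes an F81-style model in which the rate from state i to state j at a site is mu * p_j, with p the site's equilibrium preferences, and the function names and example preference vectors are hypothetical. Under that assumption the expected divergence at a site saturates at 1 - sum(p_i^2), which is far below 0.75 for a conserved nucleotide site, so a flat-preference (Jukes-Cantor-style) correction returns too short a branch.

```python
import math

def heterozygosity(prefs):
    """Saturation level of divergence at a site: 1 - sum_i p_i^2."""
    return 1.0 - sum(p * p for p in prefs)

def mean_divergence(sites, tau):
    """Expected fraction of differing sites between two equilibrium sequences
    separated by scaled time tau = mu * t, assuming Q_ij = mu * p_j per site."""
    decay = 1.0 - math.exp(-tau)
    return sum(decay * heterozygosity(p) for p in sites) / len(sites)

def true_branch_length(sites, tau):
    """Expected substitutions per site under the assumed site-specific model."""
    return tau * sum(heterozygosity(p) for p in sites) / len(sites)

def flat_model_branch_length(d, n_states=4):
    """Jukes-Cantor-style estimate from divergence d, assuming flat preferences."""
    saturation = 1.0 - 1.0 / n_states
    if d >= saturation:
        return math.inf
    return -saturation * math.log(1.0 - d / saturation)

# Hypothetical alignment: three conserved sites and one unconstrained site.
conserved = [0.97, 0.01, 0.01, 0.01]
flat = [0.25, 0.25, 0.25, 0.25]
sites = [conserved, conserved, conserved, flat]

for tau in (0.1, 1.0, 5.0):
    d = mean_divergence(sites, tau)
    print(f"tau={tau}: divergence={d:.3f}, "
          f"true length={true_branch_length(sites, tau):.3f}, "
          f"flat-model estimate={flat_model_branch_length(d):.3f}")
```

In this toy setting the flat-model estimate tracks the true length for short branches and falls increasingly behind as divergence approaches its site-specific saturation level.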
Pages: 11