Site-Specific Amino Acid Distributions Follow a Universal Shape

Cited by: 3
Authors
Johnson, Mackenzie M. [1, 2]
Wilke, Claus O. [1]
Affiliations
[1] Univ Texas Austin, Dept Integrat Biol, Austin, TX 78712 USA
[2] Univ Texas Austin, Inst Cellular & Mol Biol, Austin, TX 78712 USA
Funding
National Institutes of Health (USA)
Keywords
Amino-acid distributions; Protein site variability; Evolutionary modeling; MUTATION-SELECTION MODELS; PROTEIN EVOLUTION; SEQUENCE EVOLUTION; SUBSTITUTION; COEFFICIENTS; STABILITY; RATES;
DOI
10.1007/s00239-020-09976-8
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
In many applications of evolutionary inference, a model of protein evolution needs to be fitted to the amino acid variation at individual sites in a multiple sequence alignment. Most existing models fall into one of two extremes: Either they provide a coarse-grained description that lacks biophysical realism (e.g., dN/dS models), or they require a large number of parameters to be fitted (e.g., mutation-selection models). Here, we ask whether a middle ground is possible: Can we obtain a realistic description of site-specific amino acid frequencies while severely restricting the number of free parameters in the model? We show that a distribution with a single free parameter can accurately capture the variation in amino acid frequency at most sites in an alignment, as long as we are willing to restrict our analysis to predicting amino acid frequencies by rank rather than by amino acid identity. This result holds equally well both in alignments of empirical protein sequences and of sequences evolved under a biophysically realistic all-atom force field. Our analysis reveals a near universal shape of the frequency distributions of amino acids. This insight has the potential to lead to new models of evolution that have both increased realism and a limited number of free parameters.
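The idea of describing a site by rank-ordered frequencies and a single parameter can be illustrated with a minimal sketch. The abstract does not name the functional form of the distribution, so the exponential decay over rank used here (p_k proportional to exp(-lam * k)) is an assumption made purely for illustration, as are the function names `ranked_frequencies`, `model`, and `fit_lambda`; this is not the authors' implementation.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def ranked_frequencies(column):
    """Frequencies of the 20 amino acids in one alignment column,
    sorted from most to least frequent (identity is discarded)."""
    counts = Counter(c for c in column.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("column contains no amino acids")
    freqs = [counts.get(a, 0) / total for a in AMINO_ACIDS]
    return sorted(freqs, reverse=True)

def model(lam, n=20):
    """ASSUMED one-parameter form: p_k proportional to exp(-lam * k),
    for ranks k = 0..n-1. The source does not specify this form."""
    weights = [math.exp(-lam * k) for k in range(n)]
    z = sum(weights)
    return [w / z for w in weights]

def mean_rank(p):
    """Expected rank under a distribution over ranks 0..n-1."""
    return sum(k * pk for k, pk in enumerate(p))

def fit_lambda(freqs, lo=0.0, hi=50.0, iters=100):
    """Maximum-likelihood fit of lam for the assumed form.
    For this exponential family the likelihood is maximized when the
    model's mean rank equals the observed mean rank, so we bisect on
    lam. A fully conserved site pushes lam toward the upper bound."""
    target = mean_rank(freqs)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean_rank(model(mid, len(freqs))) > target:
            lo = mid  # model is too flat: increase lam
        else:
            hi = mid  # model is too peaked: decrease lam
    return (lo + hi) / 2

# Hypothetical alignment column (gaps are ignored).
site = ranked_frequencies("AAAAGGSS-")
lam_hat = fit_lambda(site)
fitted = model(lam_hat)
```

Because the fit uses only the sorted frequencies, two sites with different preferred residues but the same degree of variability would receive the same parameter value, which is the restriction to rank rather than identity that the abstract describes.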
Pages: 731-741
Page count: 11
Related papers
47 in total
  • [1] Trends in substitution models of molecular evolution
    Arenas, Miguel
    [J]. FRONTIERS IN GENETICS, 2015, 6
  • [2] Maximum-Likelihood Phylogenetic Inference with Selection on Protein Folding Stability
    Arenas, Miguel
    Sanchez-Cobos, Agustin
    Bastolla, Ugo
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2015, 32 (08) : 2195 - 2207
  • [3] Simulation of Genome-Wide Evolution under Heterogeneous Substitution Models and Complex Multispecies Coalescent Histories
    Arenas, Miguel
    Posada, David
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2014, 31 (05) : 1295 - 1301
  • [4] ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules
    Ashkenazy, Haim
    Abadi, Shiran
    Martz, Eric
    Chay, Ofer
    Mayrose, Itay
    Pupko, Tal
    Ben-Tal, Nir
    [J]. NUCLEIC ACIDS RESEARCH, 2016, 44 (W1) : W344 - W350
  • [5] Bastolla U, 2019, METHODS MOL BIOL, V1851, P215, DOI 10.1007/978-1-4939-8736-8_11
  • [6] CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING
    BENJAMINI, Y
    HOCHBERG, Y
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) : 289 - 300
  • [7] Modeling residue usage in aligned protein sequences via maximum likelihood
    Bruno, WJ
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 1996, 13 (10) : 1368 - 1374
  • [8] Solvent Exposure Imparts Similar Selective Pressures across a Range of Yeast Proteins
    Conant, Gavin C.
    Stadler, Peter F.
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2009, 26 (05) : 1155 - 1161
  • [9] Understanding conserved amino acids in proteins
    Dokholyan, NV
    Mirny, LA
    Shakhnovich, EI
    [J]. PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2002, 314 (1-4) : 600 - 606
  • [10] Understanding hierarchical protein evolution from first principles
    Dokholyan, NV
    Shakhnovich, EI
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 2001, 312 (01) : 289 - 307