nT4X and nT4M: Novel Time Non-reversible Mixture Amino Acid Substitution Models

Citations: 0
Authors
Tinh, Nguyen Huy [1]
Dang, Cuong Cao [1]
Vinh, Le Sy [1]
Affiliations
[1] Vietnam Natl Univ, Univ Engn & Technol, 144 Xuan Thuy, Hanoi 10000, Vietnam
Keywords
Amino acid substitution models; Maximum likelihood estimation method; Mixture models; Time non-reversible models; Protein evolution; HSSP database; Performance; Phylogeny; Selection
DOI
10.1007/s00239-024-10230-8
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
One of the most important and difficult challenges in the research of molecular evolution is modeling the process of amino acid substitutions. Although single-matrix models, such as the LG model, are popular, their capability to properly capture the heterogeneity of the substitution process across sites is still questioned. Several mixture models with multiple matrices have been introduced and shown to offer advantages over single-matrix models. Current general mixture models assume the reversibility of the evolutionary process, implying that substitution rates between any two amino acids are equal in both forward and backward directions. This assumption is not based on biological properties but rather on computational simplicity. The well-known hypothesis is that more realistic models can yield more accurate evolutionary inferences; therefore, our aim is to estimate more biologically realistic models. To this end, we relax the assumption of reversibility and introduce two new general non-reversible 4-matrix mixture models, called nT4M and nT4X. Using alignments from HSSP and TreeBASE databases as data, our newly estimated models outperformed all single-matrix models and almost all reversible mixture models. Moreover, the new non-reversible mixture models enable us to infer rooted trees.
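To make the reversibility assumption concrete: a substitution model with rate matrix Q and stationary frequencies π is time-reversible when π_i·q_ij = π_j·q_ji for every pair of states i and j. The following is a minimal sketch, not the authors' estimation procedure. It uses a hypothetical 3-state toy alphabet instead of the 20 amino acids, and the helper names (`build_reversible_q`, `is_stationary`, `is_reversible`, `free_parameters`) are illustrative assumptions. The parameter counts assume the common convention that one rate is fixed by normalising the matrix.

```python
def build_reversible_q(exch, freqs):
    """Build a reversible rate matrix q_ij = r_ij * pi_j from symmetric
    exchangeabilities r_ij and stationary frequencies pi (toy helper)."""
    n = len(freqs)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exch[i][j] * freqs[j]
        q[i][i] = -sum(q[i])  # each row of a rate matrix sums to zero
    return q


def is_stationary(q, freqs, tol=1e-12):
    """Check pi * Q = 0, i.e. freqs is a stationary distribution of Q."""
    n = len(freqs)
    return all(
        abs(sum(freqs[i] * q[i][j] for i in range(n))) < tol for j in range(n)
    )


def is_reversible(q, freqs, tol=1e-12):
    """Check detailed balance: pi_i * q_ij == pi_j * q_ji for all i != j."""
    n = len(freqs)
    return all(
        abs(freqs[i] * q[i][j] - freqs[j] * q[j][i]) < tol
        for i in range(n)
        for j in range(i + 1, n)
    )


def free_parameters(n_states, reversible):
    """Free parameters of one rate matrix (assumed convention: one rate is
    fixed by normalisation)."""
    if reversible:
        # symmetric exchangeabilities (minus one) plus free frequencies
        return n_states * (n_states - 1) // 2 - 1 + (n_states - 1)
    # every off-diagonal rate is free, minus one for normalisation
    return n_states * (n_states - 1) - 1


# Toy reversible model: symmetric exchangeabilities and chosen frequencies.
exch = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
pi_rev = [0.5, 0.3, 0.2]
q_rev = build_reversible_q(exch, pi_rev)

# Toy non-reversible model: faster around the cycle 0 -> 1 -> 2 -> 0 than
# in the opposite direction. The uniform distribution is still stationary.
q_nonrev = [
    [-1.5, 1.0, 0.5],
    [0.5, -1.5, 1.0],
    [1.0, 0.5, -1.5],
]
pi_uniform = [1 / 3, 1 / 3, 1 / 3]

print(is_reversible(q_rev, pi_rev))          # detailed balance holds
print(is_reversible(q_nonrev, pi_uniform))   # detailed balance fails
print(free_parameters(20, reversible=True), free_parameters(20, reversible=False))
```

In the toy non-reversible matrix, forward and backward rates between two states differ while the stationary distribution still exists, which is the kind of directionality that lets a non-reversible model distinguish root positions on a tree. Under the stated convention, a single 20-state non-reversible matrix has 379 free parameters versus 208 for a reversible one.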
Pages: 136-148
Number of pages: 13