Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model

Cited by: 460
Authors
Lartillot, Nicolas
Brinkmann, Henner
Philippe, Herve
Affiliations
[1] Univ Montpellier 2, Lab Informat Robot & Microelect Montpellier, CNRS, UMR 5506, F-34392 Montpellier 5, France
[2] Univ Montreal, Dept Biochim, Canadian Inst Adv Res, Montreal, PQ H3C 3J7, Canada
DOI
10.1186/1471-2148-7-S1-S4
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: Thanks to the large amount of signal contained in genome-wide sequence alignments, phylogenomic analyses are converging towards highly supported trees. However, high statistical support does not imply that the tree is accurate. Systematic errors, such as the Long Branch Attraction (LBA) artefact, can be misleading, in particular when the taxon sampling is poor, or the outgroup is distant. In an otherwise consistent probabilistic framework, systematic errors in genome-wide analyses can be traced back to model mis-specification problems, which suggests that better models of sequence evolution should be devised, that would be more robust to tree reconstruction artefacts, even under the most challenging conditions.
Methods: We focus on a well characterized LBA artefact analyzed in a previous phylogenomic study of the metazoan tree, in which two fast-evolving animal phyla, nematodes and platyhelminths, emerge either at the base of all other Bilateria, or within protostomes, depending on the outgroup. We use this artefactual result as a case study for comparing the robustness of two alternative models: a standard, site-homogeneous model, based on an empirical matrix of amino-acid replacement (WAG), and a site-heterogeneous mixture model (CAT). In parallel, we propose a posterior predictive test, allowing one to measure how well a model acknowledges sequence saturation.
Results: Adopting a Bayesian framework, we show that the LBA artefact observed under WAG disappears when the site-heterogeneous model CAT is used. Using cross-validation, we further demonstrate that CAT has a better statistical fit than WAG on this data set. Finally, using our statistical goodness-of-fit test, we show that CAT, but not WAG, correctly accounts for the overall level of saturation, and that this is due to a better estimation of site-specific amino-acid preferences.
Conclusion: The CAT model appears to be more robust than WAG against LBA artefacts, essentially because it correctly anticipates the high probability of convergences and reversions implied by the small effective size of the amino-acid alphabet at each site of the alignment. More generally, our results provide strong evidence that site-specificities in the substitution process need be accounted for in order to obtain more reliable phylogenetic trees.
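The link the conclusion draws between a small effective alphabet and frequent convergences can be illustrated with a rough numerical sketch. This is not the CAT model or the posterior predictive test from the paper. It only assumes that, at saturation, two lineages draw their state at a site independently from the same amino-acid frequency profile, so the chance that they coincide is the sum of squared frequencies. The function names and the example profiles are hypothetical, chosen for illustration only.

```python
def coincidence_probability(profile):
    """Chance that two independent draws from `profile` give the same state.

    `profile` is a list of non-negative amino-acid weights for one site;
    it is normalised here, so it does not need to sum to exactly 1.
    """
    total = sum(profile)
    return sum((weight / total) ** 2 for weight in profile)


def effective_alphabet_size(profile):
    """Inverse of the coincidence probability (equals 20 for a flat profile)."""
    return 1.0 / coincidence_probability(profile)


# Hypothetical profiles, not taken from the paper:
# a flat profile over all 20 amino acids ...
flat_profile = [1.0] * 20
# ... and a site that tolerates only three amino acids.
restricted_profile = [0.5, 0.3, 0.2] + [0.0] * 17

for name, profile in [("flat", flat_profile), ("restricted", restricted_profile)]:
    print(
        name,
        round(coincidence_probability(profile), 3),
        round(effective_alphabet_size(profile), 2),
    )
```

Under these assumptions the restricted site yields chance identities far more often than the flat one (0.38 versus 0.05), which is the kind of convergence a model ignoring site-specific preferences would tend to underestimate.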
Pages: 14