Maximum likelihood molecular clock comb: Analytic solutions

Cited by: 5
Authors
Chor, Benny
Khetan, Amit
Snir, Sagi [1]
Affiliations
[1] Univ Calif Berkeley, Dept Math, Berkeley, CA 94720 USA
[2] Tel Aviv Univ, Sch Comp Sci, IL-39040 Tel Aviv, Israel
[3] Univ Massachusetts, Dept Math, Amherst, MA 01003 USA
Keywords
maximum likelihood; phylogenetic trees; analytic solutions; Hadamard conjugation; symbolic manipulation
DOI
10.1089/cmb.2006.13.819
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Maximum likelihood (ML) is increasingly used as an optimality criterion for selecting evolutionary trees, but finding the global optimum is a hard computational task. Because no general analytic solution is known, numeric techniques such as hill climbing or expectation maximization (EM) are used to find optimal parameters for a given tree. So far, analytic solutions have been derived only for the simplest model: three taxa, two-state characters, under a molecular clock. Rooted four-taxa trees have two topologies: the fork (two subtrees with two leaves each) and the comb (one subtree with three leaves, the other with a single leaf). In a previous work, we devised a closed-form analytic solution for the ML molecular clock fork. In this work, we extend the state of the art in analytic solutions for ML trees to the family of all four-taxa trees under the molecular clock assumption. The change from the fork topology to the comb incurs a major increase in the complexity of the underlying algebraic system and requires novel techniques and approaches. We combine the ultrametric properties of molecular clock trees with the Hadamard conjugation to derive a number of topology-dependent identities. Employing these identities, we substantially simplify the system of polynomial equations. Finally, we use tools from algebraic geometry (e.g., Gröbner bases, ideal saturation, resultants) and employ symbolic algebra software to obtain analytic solutions for the comb. We show that, in contrast to the fork, the comb has no closed-form solutions (expressed by radicals in the input data). In general, four-taxa trees can have multiple ML points. In contrast, we can now prove that under the molecular clock assumption, the comb has a unique (local and global) ML point. (Such uniqueness was previously shown for the fork.)
Pages: 819-837
Number of pages: 19
Related papers
13 items in total
  • [1] [Anonymous], 1976, ALGEBRAIC GEOMETRY
  • [2] [Anonymous], 1971, STAT DECISION THEORY
  • [3] Molecular clock fork phylogenies: Closed form analytic maximum likelihood solutions
    Chor, B
    Snir, S
    [J]. SYSTEMATIC BIOLOGY, 2004, 53 (06) : 963 - 967
  • [4] Multiple maxima of likelihood in phylogenetic trees: An analytic approach
    Chor, B
    Hendy, MD
    Holland, BR
    Penny, D
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2000, 17 (10) : 1529 - 1541
  • [5] CHOR B, 2003, P 7 ANN INT C COMP M, P76
  • [6] CHOR B, 2001, WABI 2001
  • [7] Gelfand I. M., 1994, Mathematics Theory & Applications, DOI 10.1007/978-0-8176-4771-1
  • [8] A DISCRETE FOURIER-ANALYSIS FOR EVOLUTIONARY TREES
    HENDY, MD
    PENNY, D
    STEEL, MA
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1994, 91 (08) : 3339 - 3343
  • [9] SPECTRAL-ANALYSIS OF PHYLOGENETIC DATA
    HENDY, MD
    [J]. JOURNAL OF CLASSIFICATION, 1993, 10 (01) : 5 - 24
  • [10] Jukes TH, 1969, MAMMALIAN PROTEIN ME, P21, DOI 10.1016/B978-1-4832-3211-9.50009-7, DOI 10.1093/BIOINFORMATICS/BTM404