A new, fast algorithm for detecting protein coevolution using maximum compatible cliques

被引:13
作者
Rodionov, Alex [1 ]
Bezginov, Alexandr [2 ]
Rose, Jonathan [1 ]
Tillier, Elisabeth R. M. [2 ,3 ]
机构
[1] Univ Toronto, Edward S Rogers Sr Dept Elect & Comp Engn, Toronto, ON, Canada
[2] Univ Toronto, Dept Med Biophys, Toronto, ON, Canada
[3] Univ Hlth Network, Ontario Canc Inst, Toronto, ON M5G 1L7, Canada
关键词
PHYLOGENETIC TREES; INDICATOR;
D O I
10.1186/1748-7188-6-17
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Background: The MatrixMatchMaker algorithm was recently introduced to detect the similarity between phylogenetic trees and thus the coevolution between proteins. MMM finds the largest common submatrices between pairs of phylogenetic distance matrices, and has numerous advantages over existing methods of coevolution detection. However, these advantages came at the cost of a very long execution time. Results: In this paper, we show that the problem of finding the maximum submatrix reduces to a multiple maximum clique subproblem on a graph of protein pairs. This allowed us to develop a new algorithm and program implementation, MMMvII, which achieved more than 600x speedup with comparable accuracy to the original MMM. Conclusions: MMMvII will thus allow for more more extensive and intricate analyses of coevolution.
引用
收藏
页数:9
相关论文
共 23 条
  • [1] OMA 2011: orthology inference among 1000 complete genomes
    Altenhoff, Adrian M.
    Schneider, Adrian
    Gonnet, Gaston H.
    Dessimoz, Christophe
    [J]. NUCLEIC ACIDS RESEARCH, 2011, 39 : D289 - D294
  • [2] [Anonymous], 2004, Inferring phylogenies
  • [3] [Anonymous], 2005, PHYLIP (phylogeny inference package) version 3.6
  • [4] Phylogenetic identification of lateral genetic transfer events
    Beiko, RG
    Hamilton, N
    [J]. BMC EVOLUTIONARY BIOLOGY, 2006, 6 (1) : 17P
  • [5] CLARK GW, 2011, NETWORK BIOL
  • [6] Jothi R, 2005, BIOINFORMATICS, V21, pI241, DOI 10.1093/bioinformatics/bti1009
  • [7] Karp R.M., 1972, PLENUM PRESS SURV ST, P85, DOI 10.1007/978-1-4684-2001-2_9
  • [8] Recent developments in the MAFFT multiple sequence alignment program
    Katoh, Kazutaka
    Toh, Hiroyuki
    [J]. BRIEFINGS IN BIOINFORMATICS, 2008, 9 (04) : 286 - 298
  • [9] KUHNER MK, 1994, MOL BIOL EVOL, V11, P459
  • [10] Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences
    Li, Weizhong
    Godzik, Adam
    [J]. BIOINFORMATICS, 2006, 22 (13) : 1658 - 1659