Tree pattern matching in phylogenetic trees: automatic search for orthologs or paralogs in homologous gene sequence databases

Cited by: 122
Authors
Dufayard, JF
Duret, L
Penel, S
Gouy, M
Rechenmann, F
Perrière, G
Affiliations
[1] Univ Lyon 1, CNRS, UMR 5558, Lab Biometrie & Biol Evolut, F-69688 Villeurbanne, France
[2] INRIA Rhone Alpes, Montbonnot St Martin, St Ismier, France
DOI
10.1093/bioinformatics/bti325
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Comparative sequence analysis is widely used to study genome function and evolution. This approach first requires identifying homologous genes and then interpreting their homology relationships (orthology or paralogy). To help with this complex task, we developed three databases of homologous genes containing sequences, multiple alignments and phylogenetic trees: HOBACGEN, HOVERGEN and HOGENOM. In this paper, we present two new tools for automating the search for orthologs or paralogs in these databases.
Results: First, we have developed and implemented an algorithm that infers speciation and duplication events by comparing gene and species trees (tree reconciliation). Second, we have developed a general method to search our databases for gene families whose tree topology matches a particular tree pattern. This unordered tree pattern matching algorithm has been implemented in the FamFetch graphical interface. Using a graphical editor, the user can specify the topology of the tree pattern and set constraints on its nodes and leaves. The pattern is then compared with all the phylogenetic trees in the database to retrieve the families in which one or several occurrences of the pattern are found. By specifying ad hoc patterns, it is therefore possible to identify orthologs in our databases.
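To make the tree reconciliation idea in the abstract concrete, the following is a minimal sketch of the classic LCA-mapping approach for rooted binary trees. It is not necessarily the algorithm implemented by the authors. The nested-tuple tree encoding, the function names `clades` and `reconcile`, the `species_of` mapping and the example gene names are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple, Union

# Hypothetical encoding: a leaf is a string, an internal node is a pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def clades(species_tree: Tree) -> List[FrozenSet[str]]:
    """Return the leaf set of every node of a rooted binary species tree."""
    found: List[FrozenSet[str]] = []

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = walk(node[0]) | walk(node[1])
        found.append(leaves)
        return leaves

    walk(species_tree)
    return found


def reconcile(
    gene_tree: Tree, species_tree: Tree, species_of: Dict[str, str]
) -> List[Tuple[str, FrozenSet[str]]]:
    """Label each internal gene-tree node as a duplication or a speciation.

    Each gene-tree node is mapped to the smallest species-tree clade that
    contains all the species below it. A node that maps to the same clade
    as one of its children is labelled a duplication, otherwise a speciation.
    Events are returned in post-order.
    """
    all_clades = clades(species_tree)

    def lca(species: FrozenSet[str]) -> FrozenSet[str]:
        # Smallest clade of the species tree containing the given species.
        return min((c for c in all_clades if species <= c), key=len)

    events: List[Tuple[str, FrozenSet[str]]] = []

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([species_of[node]])
        left = walk(node[0])
        right = walk(node[1])
        here = lca(left | right)
        kind = "duplication" if here in (lca(left), lca(right)) else "speciation"
        events.append((kind, here))
        return left | right

    walk(gene_tree)
    return events


# Illustrative data only: two gene copies in human and mouse, one in chicken.
species_tree: Tree = (("human", "mouse"), "chicken")
gene_tree: Tree = ((("h1", "m1"), ("h2", "m2")), "c1")
species_of = {"h1": "human", "h2": "human", "m1": "mouse",
              "m2": "mouse", "c1": "chicken"}

events = reconcile(gene_tree, species_tree, species_of)
for kind, clade in events:
    print(kind, sorted(clade))
```

In this toy example, the node joining the two human/mouse subtrees is labelled a duplication, so h1 and m1 are orthologs of each other while h1 and m2 are paralogs. Real gene trees are often unrooted or multifurcating and would need additional handling that this sketch leaves out.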
Pages: 2596-2603
Number of pages: 8