YAKUSA: A fast structural database scanning method

Cited by: 51
Authors
Carpentier, M [1]
Brouillet, S [1]
Pothier, J [1]
Affiliations
[1] Univ Paris 06, Atelier BioInformat, F-75005 Paris, France
Keywords
protein structural similarities; protein internal coordinates; mixture transition distribution model
DOI
10.1002/prot.20517
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
YAKUSA is a program designed for rapid scanning of a structural database with a query protein structure. It searches for the longest common substructures called SHSPs (structural high-scoring pairs) existing between a query structure and every structure in the structural database. It makes use of protein backbone internal coordinates (alpha angles) in order to describe protein structures as sequences of symbols. The structural similarities are established in 5 steps, the first 3 being analogous to those used in BLAST: (1) building up a deterministic finite automaton describing all patterns identical or similar to those in the query structure; (2) searching for all these patterns in every structure in the database; (3) extending the patterns to longer matching substructures (i.e., SHSPs); (4) selecting compatible SHSPs for each query-database structure pair; and (5) ranking the query-database structure pairs using 3 scores based on SHSP similarity, on SHSP probabilities, and on spatial compatibility of SHSPs. Structural fragment probabilities are estimated according to a mixture transition distribution model, which is an approximation of a high-order Markov chain model. With regard to sensitivity and selectivity of the structural matches, YAKUSA compares well to the best related programs, although it is by far faster: A typical database scan takes about 40 s CPU time on a desktop personal computer. It has also been implemented on a Web server for real-time searches.
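
To make the first three steps of the abstract concrete, the following Python sketch encodes a C-alpha trace as a sequence of discretized alpha angles, then finds seed matches and extends them without gaps. It is an illustration under stated assumptions, not YAKUSA's implementation: the alpha angle is taken as the dihedral defined by four consecutive C-alpha atoms, while the 10-degree bins, the seed length `k`, the tolerance `tol`, the function names, and the use of a hash index in place of the deterministic finite automaton are all hypothetical choices made for brevity. Steps (4) and (5), and the mixture transition distribution scoring, are not covered.

```python
import math
from collections import defaultdict

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def alpha_angle(p0, p1, p2, p3):
    """Signed dihedral (degrees) defined by four consecutive C-alpha positions."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))

def encode(ca_coords, bin_width=10):
    """Turn a C-alpha trace into symbols; uniform bins are an assumption."""
    n_bins = 360 // bin_width
    symbols = []
    for i in range(len(ca_coords) - 3):
        a = alpha_angle(*ca_coords[i:i + 4])
        symbols.append(int((a + 180.0) // bin_width) % n_bins)
    return symbols

def circ_diff(a, b, n_bins):
    """Distance between two angle bins on a circle of n_bins bins."""
    d = abs(a - b) % n_bins
    return min(d, n_bins - d)

def find_segments(query, target, k=4, tol=1, n_bins=36):
    """Exact k-symbol seeds, then ungapped extension while bins differ by <= tol.

    A hash index stands in for the automaton described in the abstract.
    Returns a list of (query_start, target_start, length) tuples.
    """
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[tuple(query[i:i + k])].append(i)

    seen, segments = set(), []
    for j in range(len(target) - k + 1):
        for i in index.get(tuple(target[j:j + k]), ()):
            qs, ts = i, j
            # Extend to the left along the same diagonal.
            while qs > 0 and ts > 0 and \
                    circ_diff(query[qs - 1], target[ts - 1], n_bins) <= tol:
                qs -= 1
                ts -= 1
            qe, te = i + k, j + k
            # Extend to the right along the same diagonal.
            while qe < len(query) and te < len(target) and \
                    circ_diff(query[qe], target[te], n_bins) <= tol:
                qe += 1
                te += 1
            seg = (qs, ts, qe - qs)
            if seg not in seen:  # overlapping seeds yield the same segment
                seen.add(seg)
                segments.append(seg)
    return segments
```

For example, `find_segments(encode(query_ca), encode(target_ca))` would list candidate matching segments between two C-alpha traces, where `query_ca` and `target_ca` are hypothetical lists of `(x, y, z)` tuples. Comparing bins on a circle matters because alpha angles near -180 and +180 degrees are geometrically close even though their bin indices are far apart.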
Pages: 137-151
Number of pages: 15