Saturated BLAST: an automated multiple intermediate sequence search used to detect distant homology

Cited by: 48
Authors
Li, WZ
Pio, F
Pawlowski, K
Godzik, A [1]
Affiliations
[1] Burnham Inst, La Jolla, CA 92037 USA
[2] San Diego Supercomp Ctr, La Jolla, CA 92093 USA
[3] Simon Fraser Univ, Burnaby, BC V5A 1S6, Canada
DOI
10.1093/bioinformatics/16.12.1105
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Two proteins can have a similar three-dimensional structure and biological function, yet have sequences so different that traditional protein sequence comparison algorithms do not identify their relationship. The desire to identify such relationships has led to the development of more sensitive sequence alignment strategies. One such strategy is the Intermediate Sequence Search (ISS), which connects two proteins through one or more intermediate sequences. In its brute-force implementation, ISS repeatedly uses the results of the previous query as new search seeds, which makes it time-consuming and difficult to analyze.
Results: Saturated BLAST is a package that performs ISS in an efficient and automated manner. It was developed using Perl and Perl/Tk and implemented on the Linux operating system. Starting with a protein sequence, Saturated BLAST runs a BLAST search and identifies representative sequences for the next generation of searches. The procedure is run until convergence or until some predefined criteria are met. Saturated BLAST has a user-friendly graphical interface, a built-in BLAST result parser, several multiple alignment tools, clustering algorithms and various filters for the elimination of false positives, thereby providing an easy way to edit, visualize, analyze, monitor and control the search. Besides detecting remote homologies, Saturated BLAST can be used to maintain protein family databases and to search for new genes in genomic databases.
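The iterative procedure described in the abstract (search, pick seeds for the next generation, repeat until convergence or a predefined limit) can be illustrated with a minimal Python sketch. This is not the Saturated BLAST code, which the abstract says was written in Perl. The function `saturated_search`, its parameters, and the toy `TOY_HITS` table are all hypothetical; the `search` callable stands in for a real BLAST run, and taking the first few new hits as seeds is a placeholder for the package's clustering-based choice of representatives.

```python
from typing import Callable, Dict, Iterable, List


def saturated_search(
    query: str,
    search: Callable[[str], Iterable[str]],
    max_generations: int = 5,
    max_seeds_per_generation: int = 10,
) -> Dict[str, int]:
    """Map every sequence id reached from `query` to the generation
    in which it was first found (the query itself is generation 0)."""
    found: Dict[str, int] = {query: 0}
    seeds: List[str] = [query]
    for generation in range(1, max_generations + 1):
        new_hits: List[str] = []
        for seed in seeds:
            # `search` is an assumed stand-in for one BLAST search.
            for hit in search(seed):
                if hit not in found:
                    found[hit] = generation
                    new_hits.append(hit)
        if not new_hits:
            break  # convergence: this generation found nothing new
        # Placeholder seed selection: keep the first few new hits.
        seeds = new_hits[:max_seeds_per_generation]
    return found


# Hypothetical similarity table: each id lists the ids a search would return.
TOY_HITS = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}


def toy_search(seq_id: str) -> List[str]:
    return TOY_HITS.get(seq_id, [])


result = saturated_search("A", toy_search)
print(result)
```

In this toy table, "A" and "D" are never returned by each other's searches, yet "D" is reached through the intermediates "B" and "C", which is the idea behind an intermediate sequence search. The generation cap plays the role of the predefined stopping criteria mentioned in the abstract.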
Pages: 1105-1110
Page count: 6