SIMAP: the similarity matrix of proteins

Cited by: 38
Authors
Rattei, Thomas
Arnold, Roland
Tischler, Patrick
Lindner, Dominik
Stumpflen, Volker
Mewes, H. Werner
Affiliations
[1] Tech Univ Munich, Dept Genome Oriented Bioinformat, D-85350 Freising Weihenstephan, Germany
[2] GSF, Natl Res Ctr Environm & Hlth, Inst Bioinformat, D-85764 Neuherberg, Germany
DOI
10.1093/nar/gkj106
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The Similarity Matrix of Proteins (SIMAP) (http://mips.gsf.de/simap) is a database built on a precomputed similarity matrix that covers the similarity space formed by more than 4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is updated incrementally. For sequence similarity searches and pairwise alignments, we implemented a grid-enabled software system based on FASTA heuristics and the Smith-Waterman algorithm. Our ProtInfo system allows querying by protein sequences covered by the SIMAP dataset as well as by fragments of these sequences, highly similar sequences and title words. Each sequence in the database is supplemented with pre-calculated features generated by detailed sequence analyses. By providing WWW interfaces as well as web services, we offer the SIMAP resource as an efficient and comprehensive tool for sequence similarity searches.
Pages: D252-D256
Page count: 5