PROTEIN FAMILY CLASSIFICATION BASED ON SEARCHING A DATABASE OF BLOCKS

Cited by: 328
Authors
HENIKOFF, S
HENIKOFF, JG
Affiliations
[1] Howard Hughes Medical Institute, Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle
[2] Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle
DOI
10.1006/geno.1994.1018
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification code
071005; 0836; 090102; 100705
Abstract
The most highly conserved regions of proteins can be represented as "blocks" of locally aligned sequence segments. Previously, an automated system was introduced to generate a database of blocks that is searched for local similarities using a sequence query. Here, we describe a method for searching this database that can also reveal significant global similarities. Local and global alignments are scored independently, so they can be used in concert to infer homology. A set of 7082 diverse sequences not represented in the database provided queries for testing this approach. The resulting distributions of scores led to guidelines for interpretation of search data and to the classification of 289 uncatalogued sequences into known groups. Thirty-eight of these relationships appear to be new discoveries. We also show how searching a database of blocks can be used to detect repeated domains and to find distinct cross-family relationships that were missed in searches of sequence databases. © 1994 Academic Press, Inc.
Pages: 97-107
Page count: 11