FINDING FLEXIBLE PATTERNS IN UNALIGNED PROTEIN SEQUENCES

Cited by: 212
Authors
JONASSEN, I
COLLINS, JF
HIGGINS, DG
Affiliations
[1] UNIV EDINBURGH, INST CELL & MOLEC BIOL, BIOCOMP RES UNIT, EDINBURGH EH9 3JR, MIDLOTHIAN, SCOTLAND
[2] EUROPEAN BIOINFORMAT INST, CAMBRIDGE CB10 1RQ, ENGLAND
Keywords
ALGORITHM; FLEXIBLE GAPS; PATTERNS; PROTEIN FAMILIES; PROSITE
DOI
10.1002/pro.5560040817
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We present a new method for the identification of conserved patterns in a set of unaligned related protein sequences. It is able to discover patterns of a quite general form, allowing for both ambiguous positions and for variable length wildcard regions. It allows the user to define a class of patterns (e.g., the degree of ambiguity allowed and the length and number of gaps), and the method is then guaranteed to find the conserved patterns in this class scoring highest according to a significance measure defined. Identified patterns may be refined using one of two new algorithms. We present a new (nonstatistical) significance measure for flexible patterns. The method is shown to recover known motifs for PROSITE families and is also applied to some recently described families from the literature.
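To make the pattern class in the abstract concrete, here is a minimal Python sketch of what "ambiguous positions" and "variable length wildcard regions" mean in PROSITE-style notation. It does not reproduce the authors' discovery or refinement algorithms or their significance measure. It only translates one flexible pattern into a regular expression and counts how many unaligned sequences contain it. The function names `prosite_to_regex` and `support`, the example pattern, and the toy sequences are all hypothetical, and only a small subset of PROSITE syntax is assumed.

```python
import re

# Hypothetical helper: handles only a small subset of PROSITE-style syntax
# (single residues, [..] ambiguity groups, x wildcards, (n) and (n,m) repeats).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a pattern such as 'C-x(2,4)-[DE]-H' into a regex string."""
    parts = []
    for element in pattern.split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        symbol, low, high = m.groups()
        core = "." if symbol == "x" else symbol  # x is a wildcard position
        if low is None:
            repeat = ""                          # exactly one position
        elif high is None:
            repeat = "{%s}" % low                # fixed-length run
        else:
            repeat = "{%s,%s}" % (low, high)     # variable-length (flexible) gap
        parts.append(core + repeat)
    return "".join(parts)


def support(pattern: str, sequences: list[str]) -> int:
    """Count the sequences that contain at least one match of the pattern."""
    regex = re.compile(prosite_to_regex(pattern))
    return sum(1 for seq in sequences if regex.search(seq))


if __name__ == "__main__":
    toy_sequences = ["AACGGDHKK", "MCAAAAEHL", "CGDH", "CAAAAADH"]  # made-up data
    print(prosite_to_regex("C-x(2,4)-[DE]-H"))
    print(support("C-x(2,4)-[DE]-H", toy_sequences))
```

In this sketch `[DE]` is an ambiguous position and `x(2,4)` is a wildcard region of two to four residues. The number of matching sequences is only a crude stand-in for pattern conservation, not the measure defined in the paper.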
Pages: 1587-1595
Page count: 9
Related papers
23 in total
[1]   THE PHD FINGER - IMPLICATIONS FOR CHROMATIN-MEDIATED TRANSCRIPTIONAL REGULATION [J].
AASLAND, R ;
GIBSON, TJ ;
STEWART, AF .
TRENDS IN BIOCHEMICAL SCIENCES, 1995, 20 (02) :56-59
[2]   THE SWISS-PROT PROTEIN-SEQUENCE DATA-BANK [J].
BAIROCH, A ;
BOECKMANN, B .
NUCLEIC ACIDS RESEARCH, 1992, 20 :2019-2022
[3]   A COMPREHENSIVE SET OF SEQUENCE-ANALYSIS PROGRAMS FOR THE VAX [J].
DEVEREUX, J ;
HAEBERLI, P ;
SMITHIES, O .
NUCLEIC ACIDS RESEARCH, 1984, 12 (01) :387-395
[4]   IMPROVED DETECTION OF HELIX-TURN-HELIX DNA-BINDING MOTIFS IN PROTEIN SEQUENCES [J].
DODD, IB ;
EGAN, JB .
NUCLEIC ACIDS RESEARCH, 1990, 18 (17) :5019-5026
[5]  
ETZOLD T, 1993, COMPUT APPL BIOSCI, V9, P49
[6]  
FUCHS R, 1994, COMPUT APPL BIOSCI, V10, P171
[7]   AUTOMATED ASSEMBLY OF PROTEIN BLOCKS FOR DATABASE SEARCHING [J].
HENIKOFF, S ;
HENIKOFF, JG .
NUCLEIC ACIDS RESEARCH, 1991, 19 (23) :6565-6572
[8]   METHODS FOR ASSESSING THE STATISTICAL SIGNIFICANCE OF MOLECULAR SEQUENCE FEATURES BY USING GENERAL SCORING SCHEMES [J].
KARLIN, S ;
ALTSCHUL, SF .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1990, 87 (06) :2264-2268
[9]   DETECTING SUBTLE SEQUENCE SIGNALS - A GIBBS SAMPLING STRATEGY FOR MULTIPLE ALIGNMENT [J].
LAWRENCE, CE ;
ALTSCHUL, SF ;
BOGUSKI, MS ;
LIU, JS ;
NEUWALD, AF ;
WOOTTON, JC .
SCIENCE, 1993, 262 (5131) :208-214
[10]   SH3 - AN ABUNDANT PROTEIN DOMAIN IN SEARCH OF A FUNCTION [J].
MUSACCHIO, A ;
GIBSON, T ;
LEHTO, VP ;
SARASTE, M .
FEBS LETTERS, 1992, 307 (01) :55-61