EXPECTATION MAXIMIZATION ALGORITHM FOR IDENTIFYING PROTEIN-BINDING SITES WITH VARIABLE LENGTHS FROM UNALIGNED DNA FRAGMENTS

Cited by: 85
Authors
CARDON, LR [1]
STORMO, GD [1]
Affiliations
[1] UNIV COLORADO, DEPT MOLEC CELLULAR & DEV BIOL, BOULDER, CO 80309 USA
Keywords
PROMOTERS; DNA-PROTEIN; EXPECTATION MAXIMUM; MULTIPLE ALIGNMENT; CONSENSUS SEQUENCES;
DOI
10.1016/0022-2836(92)90723-W
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
An Expectation Maximization algorithm for identification of DNA binding sites is presented. The approach predicts the location of binding regions while allowing variable length spacers within the sites. In addition to predicting the most likely spacer length for a set of DNA fragments, the method identifies individual sites that differ in spacer size. No alignment of DNA sequences is necessary. The method is illustrated by application to 231 Escherichia coli DNA fragments known to contain promoters with variable spacings between their consensus regions. Maximum-likelihood tests of the differences between the spacing classes indicate that the consensus regions of the spacing classes are not distinct. Further tests suggest that several positions within the spacing region may contribute to promoter specificity. © 1992.
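The comparison of spacing classes mentioned in the abstract can be illustrated with a likelihood-ratio (G) statistic. This record does not give the form of the authors' test, so what follows is only a minimal sketch under stated assumptions: the sites in each class are already aligned and of equal length, positions are treated as independent, and the helper names `count_matrix` and `g_statistic` are hypothetical. It compares per-position base frequencies estimated separately for two classes against frequencies estimated from the pooled sites.

```python
import math

BASES = "ACGT"

def count_matrix(sites):
    """Per-position base counts for equal-length, already aligned sites."""
    width = len(sites[0])
    counts = [{b: 0 for b in BASES} for _ in range(width)]
    for site in sites:
        for i, base in enumerate(site):
            counts[i][base] += 1
    return counts

def g_statistic(class_a, class_b):
    """Likelihood-ratio statistic: separate vs. pooled base frequencies.

    Assumption (not from the source): positions are independent, so the
    per-position 2 x 4 contingency-table statistics are simply summed.
    """
    counts_a, counts_b = count_matrix(class_a), count_matrix(class_b)
    g = 0.0
    for pos_a, pos_b in zip(counts_a, counts_b):
        n_a, n_b = sum(pos_a.values()), sum(pos_b.values())
        for base in BASES:
            pooled = (pos_a[base] + pos_b[base]) / (n_a + n_b)
            for n_obs, n_tot in ((pos_a[base], n_a), (pos_b[base], n_b)):
                if n_obs > 0:  # zero counts contribute nothing to the sum
                    g += 2.0 * n_obs * math.log((n_obs / n_tot) / pooled)
    return g

# Toy data, invented for illustration only.
same = g_statistic(["TATAAT", "TATGAT"], ["TATAAT", "TATGAT"])
different = g_statistic(["A", "A"], ["C", "C"])
```

Under the null hypothesis that both classes share one set of frequencies, this statistic is approximately chi-square distributed with 3 degrees of freedom per position for two classes; a small value is consistent with the classes not being distinct.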
Pages: 159-170
Number of pages: 12
Related papers
27 in total
[1]   SELECTION OF DNA-BINDING SITES BY REGULATORY PROTEINS - STATISTICAL-MECHANICAL THEORY AND APPLICATION TO OPERATORS AND PROMOTERS [J].
BERG, OG ;
VONHIPPEL, PH .
JOURNAL OF MOLECULAR BIOLOGY, 1987, 193 (04) :723-743
[2]  
BEYER WH, 1988, HDB TABLES PROBABILI
[3]   PROMOTERS OF ESCHERICHIA-COLI - A HIERARCHY OF INVIVO STRENGTH INDICATES ALTERNATE STRUCTURES [J].
DEUSCHLE, U ;
KAMMERER, W ;
GENTZ, R ;
BUJARD, H .
EMBO JOURNAL, 1986, 5 (11) :2987-2994
[4]  
Edwards A. F., 1972, LIKELIHOOD
[5]  
Feinberg SE, 1975, DISCRETE MULTIVARIAT
[6]   RIGOROUS PATTERN-RECOGNITION METHODS FOR DNA-SEQUENCES - ANALYSIS OF PROMOTER SEQUENCES FROM ESCHERICHIA-COLI [J].
GALAS, DJ ;
EGGERT, M ;
WATERMAN, MS .
JOURNAL OF MOLECULAR BIOLOGY, 1985, 186 (01) :117-128
[7]   ANALYSIS OF ESCHERICHIA-COLI PROMOTER SEQUENCES [J].
HARLEY, CB ;
REYNOLDS, RP .
NUCLEIC ACIDS RESEARCH, 1987, 15 (05) :2343-2361
[8]   COMPILATION AND ANALYSIS OF ESCHERICHIA-COLI PROMOTER DNA-SEQUENCES [J].
HAWLEY, DK ;
MCCLURE, WR .
NUCLEIC ACIDS RESEARCH, 1983, 11 (08) :2237-2255
[9]  
HERTZ GZ, 1990, COMPUT APPL BIOSCI, V6, P81
[10]  
KENDALL MG, 1977, ADV THEORY STATISTIC, V1