Detection of eukaryotic promoters using Markov transition matrices

Cited by: 47
Authors
Audic, S
Claverie, JM
Affiliations
[1] Struct. and Genetic Info. Laboratory, C.N.R.S.-E.P. 91, Inst. Struct. Biol. and Microbiol., Marseille 13402
Source
COMPUTERS & CHEMISTRY | 1997 / Vol. 21 / Issue 04
DOI
10.1016/S0097-8485(96)00040-X
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Eukaryotic promoters are among the most important functional domains yet to be characterized in a satisfactory manner in genomic sequences. Most current detection methods rely on the recognition of individual transcription elements using position-weight matrices (PWM) or consensus sequences. Here, we study a simple promoter detection algorithm based on Markov transition matrices built from sequences upstream of proven transcription initiation sites. Performance was evaluated on both the training set and a test set of promoter-containing sequences. The results on the training set are surprisingly good, given that the algorithm does not incorporate any specific knowledge about promoters. Yet the program exhibits the pathological behaviour typical of all training set-based methods: a significant decline in performance when confronted with previously unseen sequences. Thus, the Markov algorithm, like the others presently available, does not truly capture the essence of eukaryotic promoters. A detection program based on a Markov model is likely to be blind to categories of promoters without close representatives in the training set. © 1997 Elsevier Science Ltd.
Pages: 223-227
Page count: 5