A gene structure prediction program using duration HMM

Authors
Tae, Hongseok [1 ,2 ]
Kong, Eun-Bae [2 ]
Park, Kiejung [1 ]
Affiliations
[1] SmallSoft Co ltd, Inst Informat Technol, Jang Dong 59-5, Taejon 305811, South Korea
[2] Chungnam Natl Univ, Dept Comp Engn, Taejon 305764, South Korea
Source
DATA MINING AND BIOINFORMATICS | 2006 / Vol. 4316
Keywords
HMM; gene prediction; GeneChaser;
DOI
Not available
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Gene structure prediction, the task of predicting protein-coding regions in a given nucleotide sequence, is a critical step in gene annotation and strongly affects downstream gene analysis and genome annotation. Because eukaryotic gene structure is much more complicated than prokaryotic gene structure, eukaryotic gene structure prediction requires more diverse and more complex computational models. We have developed GeneChaser, a gene structure prediction program based on a duration hidden Markov model. GeneChaser consists of two major processes: training on datasets to produce parameter values, and predicting protein-coding regions from those parameter values. The program predicts multiple genes, rather than a single gene, from a DNA sequence. To predict gene structure in a very large chromosomal DNA sequence, it splits the sequence into overlapping fragments and runs the prediction process on each fragment. Several computational models were implemented to detect signal patterns, and their scanning efficiency was evaluated. The prediction performance of GeneChaser was compared, on several criteria, with that of commonly used programs, GeneID and Morgan.
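The abstract states that large chromosomal sequences are split into overlapping fragments before prediction, but gives no fragment length, overlap size, or merging rule. The following is a minimal sketch of that splitting step only, not GeneChaser's implementation. The function name `split_overlapping` and the parameters `fragment_len` and `overlap` are hypothetical and chosen for illustration.

```python
def split_overlapping(sequence, fragment_len, overlap):
    """Split `sequence` into windows of `fragment_len` that share `overlap` bases.

    Returns a list of (start_offset, fragment) pairs. The names and the
    fixed-size windowing scheme are illustrative assumptions; the source
    does not specify how GeneChaser chooses fragment boundaries.
    """
    if fragment_len <= 0:
        raise ValueError("fragment_len must be positive")
    if not 0 <= overlap < fragment_len:
        raise ValueError("overlap must satisfy 0 <= overlap < fragment_len")
    if not sequence:
        return []

    step = fragment_len - overlap  # distance between consecutive window starts
    fragments = []
    start = 0
    while True:
        fragments.append((start, sequence[start:start + fragment_len]))
        # Stop once the current window reaches the end of the sequence.
        if start + fragment_len >= len(sequence):
            break
        start += step
    return fragments


if __name__ == "__main__":
    for offset, fragment in split_overlapping("ACGTACGTAC", fragment_len=4, overlap=2):
        print(offset, fragment)
```

Keeping the start offset with each fragment lets per-fragment predictions be mapped back to chromosome coordinates. How predictions in the shared regions are reconciled is not described in the abstract and is left out of this sketch.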
Pages: 146 / +
Page count: 2