Efficient implementation of a generalized pair hidden Markov model for comparative gene finding

Cited by: 18
Authors
Majoros, WH [1]
Pertea, M [1]
Salzberg, SL [1]
Affiliations
[1] Inst Genom Res, Bioinformat Dept, Rockville, MD USA
Funding
National Institutes of Health (USA)
DOI
10.1093/bioinformatics/bti297
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: The increased availability of genome sequences of closely related organisms has generated much interest in utilizing homology to improve the accuracy of gene prediction programs. Generalized pair hidden Markov models (GPHMMs) have been proposed as one means to address this need. However, all GPHMM implementations currently available are either closed-source or the details of their operation are not fully described in the literature, leaving a significant hurdle for others wishing to advance the state of the art in GPHMM design.

Results: We have developed an open-source GPHMM gene finder, TWAIN, which performs very well on two related Aspergillus species, A. fumigatus and A. nidulans, finding 89% of the exons and predicting 74% of the gene models exactly correctly in a test set of 147 conserved gene pairs. We describe the implementation of this GPHMM and we explicitly address the assumptions and limitations of the system. We suggest possible ways of relaxing those assumptions to improve the utility of the system without sacrificing efficiency beyond what is practical.
Pages: 1782-1788
Page count: 7