Accurate identification of novel human genes through simultaneous gene prediction in human, mouse, and rat

Cited by: 20
Authors
Dewey, C
Wu, JQ
Cawley, S
Alexandersson, M
Gibbs, R
Pachter, L [1]
Affiliations
[1] Univ Calif Berkeley, Dept Math, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Elect Engn, Berkeley, CA 94720 USA
[3] Affymetrix Inc, Emeryville, CA 94608 USA
[4] Fraunhofer Chalmers Ctr, SE-41288 Gothenburg, Sweden
[5] Baylor Coll Med, Human Genome Sequencing Ctr, Dept Mol & Human Genet, Houston, TX 77030 USA
Keywords
DOI
10.1101/gr.1939804
CLC classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
We describe a new method for simultaneously identifying novel homologous genes with identical structure in the human, mouse, and rat genomes by combining pairwise predictions made with the SLAM gene-finding program. Using this method, we found 3698 gene triples in the human, mouse, and rat genomes which are predicted with exactly the same gene structure. We show, both computationally and experimentally, that the introns of these triples are predicted accurately as compared with the introns of other ab initio gene prediction sets. Computationally, we compared the introns of these gene triples, as well as those from other ab initio gene finders, with known intron annotations. We show that a unique property of SLAM, namely that it predicts gene structures simultaneously in two organisms, is key to producing sets of predictions that are highly accurate in intron structure when combined with other programs. Experimentally, we performed reverse transcription-polymerase chain reaction (RT-PCR) in both the human and rat to test the exon pairs flanking introns from a subset of the gene triples for which the human gene had not been previously identified. By performing RT-PCR on orthologous introns in both the human and rat genomes, we additionally explore the validity of using RT-PCR as a method for confirming gene predictions.
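The idea of combining pairwise predictions into gene triples, and then scoring introns against known annotations, can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not call SLAM: the data model (a gene as chromosome, strand, and exon coordinates), the consistency rule (a triple is kept only when the human-mouse, human-rat, and mouse-rat runs all agree on each species' gene structure), and the names `find_triples`, `introns`, and `intron_accuracy` are assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical data model: a gene is (chromosome, strand, exons),
# where exons is a tuple of (start, end) coordinates.
Exon = Tuple[int, int]
Gene = Tuple[str, str, Tuple[Exon, ...]]
Pair = Tuple[Gene, Gene]
Intron = Tuple[str, str, int, int]


def find_triples(hm: Iterable[Pair], hr: Iterable[Pair],
                 mr: Iterable[Pair]) -> List[Tuple[Gene, Gene, Gene]]:
    """Return (human, mouse, rat) triples supported by all three pairwise runs.

    Assumed rule: the human gene must be identical in the human-mouse and
    human-rat pairs, and the (mouse, rat) pair must also appear in mouse-rat.
    """
    rats_by_human = {}
    for human, rat in hr:
        rats_by_human.setdefault(human, set()).add(rat)
    mouse_rat_pairs = set(mr)

    triples = []
    for human, mouse in hm:
        for rat in sorted(rats_by_human.get(human, ())):
            if (mouse, rat) in mouse_rat_pairs:
                triples.append((human, mouse, rat))
    return triples


def introns(gene: Gene) -> Set[Intron]:
    """Introns implied by a gene: the gaps between consecutive exons."""
    chrom, strand, exons = gene
    ordered = sorted(exons)
    return {(chrom, strand, left[1], right[0])
            for left, right in zip(ordered, ordered[1:])}


def intron_accuracy(predicted: Iterable[Gene],
                    annotated: Iterable[Gene]) -> float:
    """Fraction of predicted introns that exactly match an annotated intron."""
    pred = set().union(*(introns(g) for g in predicted))
    known = set().union(*(introns(g) for g in annotated))
    return len(pred & known) / len(pred) if pred else 0.0
```

Requiring exact agreement across all three pairwise runs is a deliberately strict filter: it discards many candidates, but the surviving triples are the ones whose intron boundaries are supported independently in each comparison.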
Pages: 661-664
Page count: 4
Related papers
17 in total
[1]   SLAM: Cross-species gene finding and alignment with a generalized pair hidden Markov model [J].
Alexandersson, M ;
Cawley, S ;
Pachter, L .
GENOME RESEARCH, 2003, 13 (03) :496-502
[2]   Prediction of complete gene structures in human genomic DNA [J].
Burge, C ;
Karlin, S .
JOURNAL OF MOLECULAR BIOLOGY, 1997, 268 (01) :78-94
[3]   Evaluation of gene structure prediction programs [J].
Burset, M ;
Guigo, R .
GENOMICS, 1996, 34 (03) :353-367
[4]   HMM sampling and applications to gene finding and alternative splicing [J].
Cawley, Simon L. ;
Pachter, Lior .
BIOINFORMATICS, 2003, 19 :II36-II41
[5]   Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes [J].
Guigó, R ;
Dermitzakis, ET ;
Agarwal, P ;
Ponting, CP ;
Parra, G ;
Reymond, A ;
Abril, JF ;
Keibler, E ;
Lyle, R ;
Ucla, C ;
Antonarakis, SE ;
Brent, MR .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2003, 100 (03) :1140-1145
[6]   The human genome browser at UCSC [J].
Kent, WJ ;
Sugnet, CW ;
Furey, TS ;
Roskin, KM ;
Pringle, TH ;
Zahler, AM ;
Haussler, D .
GENOME RESEARCH, 2002, 12 (06) :996-1006
[7]   KORF J, 2001, BIOINFORMATICS, V1, pS1
[8]   Current methods of gene prediction, their strengths and weaknesses [J].
Mathé, C ;
Sagot, MF ;
Schiex, T ;
Rouzé, P .
NUCLEIC ACIDS RESEARCH, 2002, 30 (19) :4103-4117
[9]   Alternative splicing in the human, mouse and rat genomes is associated with an increased frequency of exon creation and/or loss [J].
Modrek, B ;
Lee, CJ .
NATURE GENETICS, 2003, 34 (02) :177-180
[10]   Low conservation of alternative splicing patterns in the human and mouse genomes [J].
Nurtdinov, RN ;
Artamonova, II ;
Mironov, AA ;
Gelfand, MS .
HUMAN MOLECULAR GENETICS, 2003, 12 (11) :1313-1320