SAGE2Splice: Unmapped SAGE tags reveal novel splice junctions

Cited by: 8
Authors
Kuo, Byron Yu-Lin
Chen, Ying
Bohacec, Slavita
Johansson, Ojvind
Wasserman, Wyeth W.
Simpson, Elizabeth M. [1]
Affiliations
[1] Univ British Columbia, Grad Program Genet, Vancouver, BC V5Z 1M9, Canada
[2] Univ British Columbia, Child & Family Res Inst, Ctr Mol Med & Therapeut, Dept Med Genet, Vancouver, BC V5Z 1M9, Canada
[3] Stockholm Bioinformat Ctr, Kungliga Tekniska Hogskolan, Stockholm, Sweden
DOI
10.1371/journal.pcbi.0020034
CLC classification number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
Serial analysis of gene expression (SAGE) is not only a method for profiling the global expression of genes but also an opportunity to discover novel transcripts. SAGE tags are mapped to known transcripts to determine the gene of origin. Tags that map neither to a known transcript nor to the genome were hypothesized to span a splice junction for which the exon combination or the exon(s) are unknown. To test this hypothesis, we developed an algorithm, SAGE2Splice, to efficiently map SAGE tags to potential splice junctions in a genome. The algorithm consists of three search levels. A scoring scheme based on position weight matrices was designed to assess the quality of candidates. Using optimized parameters for SAGE2Splice analysis and two sets of SAGE data, candidate junctions were discovered for 5%-6% of unmapped tags. Candidates were classified into three categories, reflecting the previous annotations of the putative splice junctions. Analysis of predicted tags extracted from EST sequences demonstrated that candidate junctions in which the splice junction lies closer to the center of the tag are more reliable. Nine of 12 candidates were validated by RT-PCR and sequencing, and among these, four revealed previously uncharacterized exons. Thus, SAGE2Splice provides a new functionality for the identification of novel transcripts and exons.
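The idea of mapping a tag across a splice junction can be illustrated with a minimal sketch. It is not the SAGE2Splice implementation: the function name `find_junction_candidates` and the parameters `min_part` and `max_intron` are hypothetical, the three search levels are not reproduced, and a plain canonical GT...AG check stands in for the position weight matrix scoring described in the abstract.

```python
def find_junction_candidates(genome, tag, min_part=2, max_intron=1000):
    """Toy search for splice junctions spanned by a tag (illustrative only).

    The tag is split at every position; the 5' part must be followed by a
    canonical donor "GT" and the 3' part preceded by a canonical acceptor
    "AG" within max_intron bases downstream. Returns a list of
    (split, intron_start, exon_start) tuples in genome coordinates.
    """
    candidates = []
    for split in range(min_part, len(tag) - min_part + 1):
        left, right = tag[:split], tag[split:]
        donor_motif = left + "GT"      # exon end followed by intron start
        acceptor_motif = "AG" + right  # intron end followed by exon start
        d = genome.find(donor_motif)
        while d != -1:
            intron_start = d + len(left)
            # The acceptor motif must end within max_intron of the donor.
            window_end = min(len(genome),
                             intron_start + max_intron + len(right))
            a = genome.find(acceptor_motif, intron_start + 2, window_end)
            while a != -1:
                candidates.append((split, intron_start, a + 2))
                a = genome.find(acceptor_motif, a + 1, window_end)
            d = genome.find(donor_motif, d + 1)
    return candidates


# Hypothetical toy data: two exon fragments separated by a GT...AG intron.
genome = "TTTT" + "CATGAAAC" + "GT" + "CCCCCCCC" + "AG" + "GGTTCA" + "TTTT"
tag = "CATGAAAC" + "GGTTCA"  # spans the junction, so it is absent from genome
hits = find_junction_candidates(genome, tag)
```

A real search would replace the exact GT/AG test with a score over the flanking bases and would need an index rather than repeated string scans to cover a whole genome.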
Pages: 276-287
Page count: 12