EID: the Exon-Intron Database - an exhaustive database of protein-coding intron-containing genes

Cited by: 109
Authors
Saxonov, S [1]
Daizadeh, I [1]
Fedorov, A [1]
Gilbert, W [1]
Affiliation
[1] Harvard Univ, Dept Mol & Cellular Biol, Cambridge, MA 02138 USA
DOI
10.1093/nar/28.1.185
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
To aid studies of molecular evolution and to assist in gene prediction research, we have constructed an Exon-Intron Database (EID) in FASTA format. Currently, the database is derived from GenBank release 112, and it contains 51 289 protein-coding genes (287 209 exons) that harbor introns, along with extensive descriptions of each gene and its DNA and protein sequences, as well as splice motif information. There is 17% redundancy inherited from GenBank; a purge at the 99% identity level reduced the database to 42 460 genes (243 589 exons). We have created subdatabases of genes whose intron positions have been experimentally determined. One such database, constructed by comparing genomic and mRNA sequences, contains 11 242 genes (62 474 exons). A larger database of 22 196 genes (105 595 exons) was constructed by selecting on keywords to eliminate computer-predicted genes. By examining the two nucleotides adjacent to the intron boundary, we infer that there is a 2% rate of errors or other deviations from the standard GT...AG motif in nuclear genes. This criterion can be used to eliminate 4921 genes from the overall database. Various tools are provided to enable generation of user-specific subsets of the EID. The EID distribution can be obtained from http://mcb.harvard.edu/gilbert/EID.
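The GT...AG criterion mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the EID tooling, which the abstract does not describe: the function names are hypothetical, and the sketch assumes each intron has already been extracted as a plain DNA string on the coding strand.

```python
from typing import Iterable


def is_canonical_intron(intron_seq: str) -> bool:
    """Return True if the intron starts with GT and ends with AG.

    Hypothetical helper: assumes the intron is given 5'->3' on the
    coding strand, so the first two bases are the donor site and the
    last two bases are the acceptor site.
    """
    s = intron_seq.upper()
    # Require at least 4 bases so donor and acceptor do not overlap.
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def has_only_canonical_introns(introns: Iterable[str]) -> bool:
    """Return True if every intron of a gene follows the GT...AG motif."""
    return all(is_canonical_intron(seq) for seq in introns)


# Toy gene with one non-standard intron (GC donor); it would be
# flagged as a deviation from the standard motif.
example_gene = ["GTAAGTTTTCAG", "GCAAGTTTTCAG"]
flagged = not has_only_canonical_introns(example_gene)
```

A gene-level filter of this kind drops a gene as soon as one of its introns deviates, which is one plausible reading of how such a criterion could remove whole genes from a database.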
Pages: 185-190
Page count: 6
Related papers
50 in total
  • [41] AluGene: a database of Alu elements incorporated within protein-coding genes
    Dagan, T
    Sorek, R
    Sharon, E
    Ast, G
    Graur, D
    NUCLEIC ACIDS RESEARCH, 2004, 32 : D489 - D492
  • [42] A cautionary note for retrocopy identification: DNA-based duplication of intron-containing genes significantly contributes to the origination of single exon genes
    Zhang, Yong E.
    Vibranovski, Maria D.
    Krinsky, Benjamin H.
    Long, Manyuan
    BIOINFORMATICS, 2011, 27 (13) : 1749 - 1753
  • [43] Biochemical analysis of TREX complex recruitment to intronless and intron-containing yeast genes
    Abruzzi, KC
    Lacadie, S
    Rosbash, M
    EMBO JOURNAL, 2004, 23 (13) : 2620 - 2631
  • [44] 5-hmC in the brain is abundant in synaptic genes and shows differences at the exon-intron boundary
    Khare, Tarang
    Pai, Shraddha
    Koncevicius, Karolis
    Pal, Mrinal
    Kriukiene, Edita
    Liutkeviciute, Zita
    Irimia, Manuel
    Jia, Peixin
    Ptak, Carolyn
    Xia, Menghang
    Tice, Raymond
    Tochigi, Mamoru
    Morera, Solange
    Nazarians, Anaies
    Belsham, Denise
    Wong, Albert H. C.
    Blencowe, Benjamin J.
    Wang, Sun Chong
    Kapranov, Philipp
    Kustra, Rafal
    Labrie, Viviane
    Klimasauskas, Saulius
    Petronis, Arturas
    NATURE STRUCTURAL & MOLECULAR BIOLOGY, 2012, 19 (10) : 1037 - U94
  • [45] Cotranscriptional recruitment of the U1 snRNP to intron-containing genes in yeast
    Kotovic, KM
    Lockshon, D
    Boric, L
    Neugebauer, KM
    MOLECULAR AND CELLULAR BIOLOGY, 2003, 23 (16) : 5768 - 5779
  • [46] A handful of intron-containing genes produces the lion's share of yeast mRNA
    Ares, M
    Grate, L
    Pauling, MH
    RNA, 1999, 5 (09) : 1138 - 1139
  • [47] EXON-INTRON ORGANIZATION OF XENOPUS MHC CLASS-II BETA-CHAIN GENES
    KOBARI, F
    SATO, K
    SHUM, BP
    TOCHINAI, S
    KATAGIRI, M
    ISHIBASHI, T
    DUPASQUIER, L
    FLAJNIK, MF
    KASAHARA, M
    IMMUNOGENETICS, 1995, 42 (05) : 376 - 385
  • [48] EvolMarkers: a database for mining exon and intron markers for evolution, ecology and conservation studies
    Li, Chenhong
    Riethoven, Jean-Jack M.
    Naylor, Gavin J. P.
    MOLECULAR ECOLOGY RESOURCES, 2012, 12 (05) : 967 - 971
  • [49] The Exon Junction Complex Controls the Splicing of mapk and Other Long Intron-Containing Transcripts in Drosophila
    Ashton-Beaucage, Dariel
    Udell, Christian M.
    Lavoie, Hugo
    Baril, Caroline
    Lefrancois, Martin
    Chagnon, Pierre
    Gendron, Patrick
    Caron-Lizotte, Olivier
    Bonneil, Eric
    Thibault, Pierre
    Therrien, Marc
    CELL, 2010, 143 (02) : 251 - 262
  • [50] RODENT AND HUMAN BETA(3)-ADRENERGIC RECEPTOR GENES CONTAIN AN INTRON WITHIN THE PROTEIN-CODING BLOCK
    GRANNEMAN, JG
    LAHNERS, KN
    RAO, DD
    MOLECULAR PHARMACOLOGY, 1992, 42 (06) : 964 - 970