EID: the Exon-Intron Database - an exhaustive database of protein-coding intron-containing genes

Cited by: 109
Authors
Saxonov, S [1]
Daizadeh, I [1]
Fedorov, A [1]
Gilbert, W [1]
Affiliations
[1] Harvard Univ, Dept Mol & Cellular Biol, Cambridge, MA 02138 USA
DOI
10.1093/nar/28.1.185
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
To aid studies of molecular evolution and to assist in gene prediction research, we have constructed an Exon-Intron Database (EID) in FASTA format. Currently, the database is derived from GenBank release 112, and it contains 51 289 protein-coding genes (287 209 exons) that harbor introns, along with extensive descriptions of each gene and its DNA and protein sequences, as well as splice motif information. There is 17% redundancy inherited from GenBank; a purge at the 99% identity level reduced the database to 42 460 genes (243 589 exons). We have created subdatabases of genes whose intron positions have been experimentally determined. One such database, constructed by comparing genomic and mRNA sequences, contains 11 242 genes (62 474 exons). A larger database of 22 196 genes (105 595 exons) was constructed by selecting on keywords to eliminate computer-predicted genes. By examining the two nucleotides adjacent to each intron boundary, we infer that there is a 2% rate of errors or other deviations from the standard GT...AG motif in nuclear genes. This criterion can be used to eliminate 4921 genes from the overall database. Various tools are provided to enable generation of user-specific subsets of the EID. The EID distribution can be obtained from http://mcb.harvard.edu/gilbert/EID.
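The GT...AG criterion described in the abstract can be illustrated with a minimal Python sketch. It assumes that intron sequences have already been extracted as plain strings and grouped by gene; the function names, the gene identifiers, and the input layout are hypothetical and are not taken from the EID tools or its FASTA header format.

```python
from typing import Dict, List, Tuple


def has_canonical_splice_sites(intron: str) -> bool:
    """Return True if the intron begins with GT and ends with AG."""
    seq = intron.upper()
    # Require at least 4 nt so the donor and acceptor dinucleotides do not overlap.
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def split_by_motif(genes: Dict[str, List[str]]) -> Tuple[List[str], List[str]]:
    """Split gene IDs into (all introns canonical, at least one deviating intron).

    `genes` maps a hypothetical gene ID to the list of its intron sequences.
    """
    kept, flagged = [], []
    for gene_id, introns in genes.items():
        if all(has_canonical_splice_sites(i) for i in introns):
            kept.append(gene_id)
        else:
            flagged.append(gene_id)
    return kept, flagged


def noncanonical_fraction(genes: Dict[str, List[str]]) -> float:
    """Fraction of all introns that deviate from the GT...AG motif."""
    introns = [i for seqs in genes.values() for i in seqs]
    if not introns:
        return 0.0
    bad = sum(not has_canonical_splice_sites(i) for i in introns)
    return bad / len(introns)


# Toy input, invented for illustration only.
example = {
    "geneA": ["GTAAGTCCTTAG", "gtgagtttcag"],
    "geneB": ["GCAAGTCCTTAG"],  # GC donor site, deviates from GT...AG
}
kept, flagged = split_by_motif(example)
```

Here a gene is flagged as soon as one of its introns deviates from the motif, which is one plausible reading of how the criterion could remove whole genes from the database; the abstract does not specify the exact rule the authors applied.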
Pages: 185-190
Page count: 6