Genome SEGE: A database for 'intronless' genes in eukaryotic genomes

Cited by: 39
Authors
Sakharkar, MK [1]
Kangueane, P [1]
Affiliations
[1] Nanyang Technol Univ, Sch Mech & Prod Engn, Nanyang Ctr Supercomp & Visualizat, Singapore 639798, Singapore
Keywords
DOI
10.1186/1471-2105-5-67
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Complete genome sequence data for a number of eukaryotes are available in the public domain. Eukaryotic genes are either 'intron containing' or 'intronless'. Eukaryotic 'intronless' genes are interesting datasets for comparative genomics and evolutionary studies. The SEGE database, a collection of eukaryotic single exon genes, is already available. However, SEGE is derived from GenBank, and the redundant, incomplete and heterogeneous nature of GenBank data is a bottleneck for biological investigation in comparative genomics and evolutionary studies. Such studies often require representative gene sets from each genome, which can be obtained only by deriving specific datasets from completely sequenced genome data. Genome SEGE, a database for 'intronless' genes in completely sequenced eukaryotic genomes, has therefore been constructed.
Availability: http://sege.ntu.edu.sg/wester/intronless
Description: Eukaryotic 'intronless' genes are extracted from nine completely sequenced genomes (four unicellular and five multicellular). The complete dataset is available for download. Data subsets are also available for 'intronless' pseudo-genes. The database provides information on the distribution of 'intronless' genes across genomes, together with their length distributions in each genome. Additionally, the search tool provides pre-computed PROSITE motifs for each sequence in the database, with hyperlinks to InterPro. A search facility is also available through the web server.
Conclusions: The feature that distinguishes Genome SEGE from SEGE is that it provides representative 'intronless' datasets for completely sequenced genomes. The 'intronless' gene sets available in this database will be of use for subsequent bio-computational analysis in comparative genomics and evolutionary studies. Such analysis may help to revisit the original genome data for re-examination and re-annotation.
Pages: 5
Related papers
11 in total
  • [1] Many G-protein-coupled receptors are encoded by retrogenes
    Brosius, J
    [J]. TRENDS IN GENETICS, 1999, 15 (08) : 304 - 305
  • [2] Genomes were forged by massive bombardments with retroelements and retrosequences
    Brosius, J
    [J]. GENETICA, 1999, 107 (1-3) : 209 - 238
  • [3] The PROSITE database, its status in 2002
    Falquet, L
    Pagni, M
    Bucher, P
    Hulo, N
    Sigrist, CJA
    Hofmann, K
    Bairoch, A
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (01) : 235 - 238
  • [4] PSEUDOGENES IN YEAST
    FINK, GR
    [J]. CELL, 1987, 49 (01) : 5 - 6
  • [5] Why are human G-protein-coupled receptors predominantly intronless?
    Gentles, AJ
    Karlin, S
    [J]. TRENDS IN GENETICS, 1999, 15 (02) : 47 - 49
  • [6] Molecular fossils in the human genome: Identification and analysis of the pseudogenes in chromosomes 21 and 22
    Harrison, PM
    Hegyi, H
    Balasubramanian, S
    Luscombe, NM
    Bertone, P
    Echols, N
    Johnson, T
    Gerstein, M
    [J]. GENOME RESEARCH, 2002, 12 (02) : 272 - 280
  • [7] Mollapour M, 2001, YEAST, V18, P173
  • [8] The InterPro Database, 2003 brings increased coverage and new features
    Mulder, NJ
    Apweiler, R
    Attwood, TK
    Bairoch, A
    Barrell, D
    Bateman, A
    Binns, D
    Biswas, M
    Bradley, P
    Bork, P
    Bucher, P
    Copley, RR
    Courcelle, E
    Das, U
    Durbin, R
    Falquet, L
    Fleischmann, W
    Griffiths-Jones, S
    Haft, D
    Harte, N
    Hulo, N
    Kahn, D
    Kanapin, A
    Krestyaninova, M
    Lopez, R
    Letunic, I
    Lonsdale, D
    Silventoinen, V
    Orchard, SE
    Pagni, M
    Peyruc, D
    Ponting, CP
    Selengut, JD
    Servant, F
    Sigrist, CJA
    Vaughan, R
    Zdobnov, EM
    [J]. NUCLEIC ACIDS RESEARCH, 2003, 31 (01) : 315 - 318
  • [9] HISTONE GENES - NOT SO SIMPLE AFTER ALL
    OLD, RW
    WOODLAND, HR
    [J]. CELL, 1984, 38 (03) : 624 - 626
  • [10] SEGE: A database on 'intron less/single exonic' genes from eukaryotes
    Sakharkar, MK
    Kangueane, P
    Petrov, DA
    Kolaskar, AS
    Subbiah, S
    [J]. BIOINFORMATICS, 2002, 18 (09) : 1266 - 1267