Exon discovery by genomic sequence alignment

Cited by: 33
Authors
Morgenstern, B
Rinner, O
Abdeddaïm, S
Haase, D
Mayer, KFX
Dress, AWM
Mewes, HW
Affiliations
[1] GSF Res Ctr, MIPS, Inst Bioinformat, D-85764 Neuherberg, Germany
[2] Univ Tubingen, Inst Chem Phys, D-72076 Tubingen, Germany
[3] Univ Rouen, Fac Sci & Tech, ABISS, LIFAR, F-76821 Mont St Aignan, France
[4] Univ Bielefeld, Res Ctr Interdisciplinary Studies Struct Format, D-33501 Bielefeld, Germany
DOI
10.1093/bioinformatics/18.6.777
CLC number
Q5 [Biochemistry];
Subject classification code
071010; 081704;
Abstract
Motivation: During evolution, functional regions in genomic sequences tend to be more highly conserved than randomly mutating 'junk DNA', so local sequence similarity often indicates biological functionality. This fact can be used to identify functional elements in large eukaryotic DNA sequences by cross-species sequence comparison. In recent years, several gene-prediction methods have been proposed that work by comparing anonymous genomic sequences, for example from human and mouse. The main advantage of these methods is that they are based on simple and generally applicable measures of (local) sequence similarity; unlike standard gene-finding approaches, they do not depend on species-specific training data or on the presence of cognate genes in databases. Like all comparative sequence-analysis methods, the new comparative gene-finding approaches rely critically on the quality of the underlying sequence alignments.
Results: Herein, we describe a new implementation of the sequence-alignment program DIALIGN that has been developed for the alignment of large genomic sequences. We compare our method to the alignment programs PipMaker, WABA and BLAST, and we show that local similarities identified by these programs are highly correlated with protein-coding regions. In our test runs, PipMaker was the most sensitive method, while DIALIGN was the most specific.
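The sensitivity and specificity comparison mentioned in the Results can be illustrated with a minimal sketch. The abstract does not state the exact definitions used, so this assumes the nucleotide-level convention common in gene-finding evaluations: sensitivity is the fraction of annotated coding positions covered by an aligned region, and specificity is the fraction of aligned positions that fall inside annotated coding regions. The function names and the toy coordinates are hypothetical and are not taken from the paper.

```python
def covered_positions(intervals):
    """Return the set of positions covered by half-open (start, end) intervals."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end))
    return positions


def sensitivity_specificity(aligned, exons):
    """Nucleotide-level sensitivity and specificity of aligned regions.

    Assumed definitions (not confirmed by the abstract):
      sensitivity = TP / (number of annotated coding positions)
      specificity = TP / (number of aligned positions)
    where TP counts positions that are both aligned and coding.
    """
    aligned_pos = covered_positions(aligned)
    exon_pos = covered_positions(exons)
    true_positives = len(aligned_pos & exon_pos)
    sensitivity = true_positives / len(exon_pos) if exon_pos else 0.0
    specificity = true_positives / len(aligned_pos) if aligned_pos else 0.0
    return sensitivity, specificity


# Hypothetical toy data: two annotated exons and two locally aligned regions.
exons = [(100, 200), (300, 400)]
aligned = [(150, 250), (300, 350)]
sn, sp = sensitivity_specificity(aligned, exons)
print(f"sensitivity={sn:.3f} specificity={sp:.3f}")
```

Under these definitions, a program that reports many aligned regions tends to gain sensitivity at the cost of specificity, which is the kind of trade-off the Results describe between PipMaker and DIALIGN.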
Pages: 777-787
Page count: 11