Sigma: multiple alignment of weakly-conserved non-coding DNA sequence

Cited by: 19
Authors
Siddharthan, Rahul [1]
Affiliation
[1] Inst Math Sci, Madras 600113, Tamil Nadu, India
DOI
10.1186/1471-2105-7-143
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Existing tools for multiple-sequence alignment focus on aligning protein sequence or protein-coding DNA sequence, and are often based on extensions to Needleman-Wunsch-like pairwise alignment methods. We introduce a new tool, Sigma, with a new algorithm and scoring scheme designed specifically for non-coding DNA sequence. This problem acquires importance with the increasing number of published sequences of closely-related species. In particular, studies of gene regulation seek to take advantage of comparative genomics, and recent algorithms for finding regulatory sites in phylogenetically-related intergenic sequence require alignment as a preprocessing step. Much can also be learned about evolution from intergenic DNA, which tends to evolve faster than coding DNA. Sigma uses a strategy of seeking the best possible gapless local alignments (a strategy earlier used by DiAlign), at each step making the best possible alignment consistent with existing alignments. It scores the significance of each alignment based on the lengths of the aligned fragments and a background model, which may be supplied or estimated from an auxiliary file of intergenic DNA.
Results: Comparative tests of Sigma with five earlier algorithms on synthetic data generated to mimic real data show excellent performance, with Sigma balancing high "sensitivity" (more bases aligned) with effective filtering of "incorrect" alignments. With real data, while "correctness" cannot be directly quantified for the alignment, running the PhyloGibbs motif finder on pre-aligned sequence suggests that Sigma's alignments are superior.
Conclusion: By taking into account the peculiarities of non-coding DNA, Sigma fills a gap in the toolbox of bioinformatics.
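The strategy named in the abstract (find the best gapless local fragment, then judge it against a background model) can be illustrated with a toy sketch. This is not Sigma's actual algorithm or scoring scheme, which the abstract does not specify: the function names are hypothetical, the fragment search is restricted to exact matches, and the background is assumed to be independent single-base frequencies.

```python
from collections import Counter


def background_frequencies(aux_seq):
    """Estimate single-base frequencies from an auxiliary DNA sequence.

    Assumption: an independent-base background; Sigma's real model may differ.
    """
    counts = Counter(aux_seq)
    total = sum(counts[base] for base in "ACGT")
    return {base: counts[base] / total for base in "ACGT"}


def best_gapless_fragment(a, b):
    """Return (length, start_a, start_b) of the longest exact-match run.

    A run lies on a single diagonal of the comparison matrix, so it
    contains no gaps. Ties keep the first run found.
    """
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous diagonal cell.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[0]:
                    best = (cur[j], i - cur[j], j - cur[j])
        prev = cur
    return best


def chance_match_probability(fragment, freqs):
    """Probability that background sequence matches `fragment` at a fixed position."""
    p = 1.0
    for base in fragment:
        p *= freqs[base]
    return p


if __name__ == "__main__":
    seq_a = "TTGACGTACC"
    seq_b = "AAGACGTAGG"
    freqs = background_frequencies("ACGT" * 5)
    length, start_a, start_b = best_gapless_fragment(seq_a, seq_b)
    fragment = seq_a[start_a:start_a + length]
    print(fragment, length, chance_match_probability(fragment, freqs))
```

The point of the sketch is that longer fragments are exponentially less likely to arise by chance under the background model, which is why fragment length can serve as a basis for significance.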
Pages: 15
Related papers
50 in total
  • [1] Sigma: multiple alignment of weakly-conserved non-coding DNA sequence
    Rahul Siddharthan
    BMC Bioinformatics, 7
  • [2] Sigma-2: Multiple sequence alignment of non-coding DNA via an evolutionary model
    Gayathri Jayaraman
    Rahul Siddharthan
    BMC Bioinformatics, 11
  • [3] Sigma-2: Multiple sequence alignment of non-coding DNA via an evolutionary model
    Jayaraman, Gayathri
    Siddharthan, Rahul
    BMC BIOINFORMATICS, 2010, 11
  • [4] Unexpected conserved non-coding DNA blocks in mammals
    Gaffney, DJ
    Keightley, PD
    TRENDS IN GENETICS, 2004, 20 (08) : 332 - 337
  • [5] Enrichment of regulatory signals in conserved non-coding genomic sequence
    Levy, S
    Hannenhalli, S
    Workman, C
    BIOINFORMATICS, 2001, 17 (10) : 871 - 877
  • [6] Short sequence motifs, overrepresented in mammalian conserved non-coding sequences
    Minovitsky, Simon
    Stegmaier, Philip
    Kel, Alexander
    Kondrashov, Alexey S.
    Dubchak, Inna
    BMC GENOMICS, 2007, 8 (1)
  • [7] Regulation of IFNγ expression by a distal conserved non-coding sequence element
    Hatton, RD
    Luther, R
    Harrington, L
    Wakefield, T
    Weaver, CT
    FASEB JOURNAL, 2005, 19 (04) : A900 - A900
  • [8] Short sequence motifs, overrepresented in mammalian conserved non-coding sequences
    Simon Minovitsky
    Philip Stegmaier
    Alexander Kel
    Alexey S Kondrashov
    Inna Dubchak
    BMC Genomics, 8
  • [9] IDENTIFICATION OF A CONSERVED SEQUENCE IN THE NON-CODING REGIONS OF MANY HUMAN GENES
    DONEHOWER, LA
    SLAGLE, BL
    WILDE, M
    DARLINGTON, G
    BUTEL, JS
    NUCLEIC ACIDS RESEARCH, 1989, 17 (02) : 699 - 710
  • [10] Multiple alignment and structure prediction of non-coding RNA sequences
    Lindgreen, Stinus
    Gardner, Paul P.
    Krogh, Anders
    BMC BIOINFORMATICS, 2007, 8 (Suppl 8)