AliBiMotif: Integrating alignment and biclustering to unravel Transcription Factor Binding Sites in DNA sequences

Cited by: 2
Authors
Goncalves, Joana P. [1]
Moreau, Yves [2]
Madeira, Sara C. [1]
Affiliations
[1] Univ Tecn Lisboa, Knowledge Discovery & Bioinformat KDBIO, INESC ID Comp Sci Dept, IST, P-1000029 Lisbon, Portugal
[2] KULeuven, Elect Engn Dept ESAT SCD, BIOI, B-3001 Heverlee, Belgium
Keywords
biclustering; sequence alignment; motif finding algorithm; structured motif identification; binding site; TF; transcription factor; TFBS; cis-regulatory module discovery; promoter region; motif finder; integrative mining; STRUCTURED MOTIFS; IDENTIFICATION; ALGORITHM; PROMOTER;
DOI
10.1504/IJDMB.2012.048198
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Transcription Factors (TFs) control transcription by binding to specific sites in the promoter regions of the target genes, which can be modelled by structured motifs. In this paper we propose AliBiMotif, a method combining sequence alignment and a biclustering approach based on efficient string matching techniques using suffix trees to unravel approximately conserved sets of blocks (structured motifs) while straightforwardly disregarding non-conserved stretches in-between. The ability to ignore the width of non-conserved regions is a major advantage of the proposed method over other motif finders, as the lengths of the binding sites are usually easier to estimate than the separating distances.
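To make the notion of a structured motif concrete, the following minimal sketch checks whether a sequence contains a given ordered set of blocks, each matched with a bounded number of substitutions, while placing no constraint on the width of the stretches between them. It is not the AliBiMotif algorithm: it uses a brute-force scan rather than suffix trees or biclustering, and it assumes the blocks are already known instead of discovering them. The function names, the mismatch threshold, and the toy sequences are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_block(seq: str, block: str, start: int, max_mismatches: int) -> Optional[int]:
    """Leftmost position >= start where `block` occurs in `seq` with at most
    `max_mismatches` substitutions, or None if there is no such position."""
    for i in range(start, len(seq) - len(block) + 1):
        if hamming(seq[i:i + len(block)], block) <= max_mismatches:
            return i
    return None


def find_structured_motif(
    seq: str, blocks: List[str], max_mismatches: int = 1
) -> Optional[List[int]]:
    """Start positions of the blocks, in order and without overlap, or None.

    The spacer between consecutive blocks may have any width (illustrative
    assumption matching the idea of ignoring non-conserved stretches).
    Taking the leftmost match of each block is sufficient to decide whether
    an ordered, non-overlapping set of occurrences exists.
    """
    positions: List[int] = []
    start = 0
    for block in blocks:
        pos = find_block(seq, block, start, max_mismatches)
        if pos is None:
            return None
        positions.append(pos)
        start = pos + len(block)  # the next block must begin after this one
    return positions


if __name__ == "__main__":
    blocks = ["TTGACA", "TATAAT"]  # toy two-block motif
    for name, seq in [
        ("seq1", "ACGTTGACAGGGGGTATAATCC"),  # exact blocks, wider spacer
        ("seq2", "TTGCCAACTATAAT"),          # one substitution, narrower spacer
        ("seq3", "GGGGGGGGGGGG"),            # no occurrence
    ]:
        print(name, find_structured_motif(seq, blocks))
```

Only the block strings and the mismatch tolerance are supplied; the two matching toy sequences have different spacer widths and are both reported as containing the motif.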
Pages: 196-215
Number of pages: 20