Using Graph Modularity Analysis to Identify Transcription Factor Binding Sites

Cited by: 0
Authors
Francisco, Alexandre P. [1]
Schbath, Sophie [2]
Freitas, Ana T. [1]
Oliveira, Arlindo L. [1]
Affiliations
[1] INESC ID IST, Lisbon, Portugal
[2] INRA, Jouy En Josas, France
Source
2010 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS (BIBMW) | 2010
Keywords
graph modularity analysis; binding sites; MICROARRAY ANALYSIS; STRUCTURED MOTIFS; IDENTIFICATION; ACTIVATION; DISCOVERY; ALGORITHM; PROMOTER;
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835
Abstract
Despite the remarkable success of computational biology methods in some application areas, such as gene finding and sequence alignment, there are still topics for which no definitive approaches have been proposed. One of these is the accurate detection of biologically significant cis-regulatory motifs, which remains an open problem despite intensive research in the field. Probabilistic motif finders are the most popular, mainly because combinatorial motif finders generate extensive and hard-to-understand lists of potential motifs. In this work, we present Needle, a method for de novo motif discovery that works by post-processing the output of a combinatorial motif finder using graph analysis techniques. The method is based on the identification of highly connected modules in a graph whose nodes correspond to motifs and in which two nodes are connected if their motifs are co-located in the sequences under analysis. We have tested this method against several well-known motif finders, using a recently published large-scale compendium of transcription factors derived from diverse high-throughput experiments in several metazoans. Preliminary results show that the method is highly competitive with state-of-the-art methods that use much more extensive information. We expect that future versions of the algorithm, which will include a number of improvements, will become one of the methods of choice for identifying significant cis-regulatory motifs that include only a small conserved core.
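The graph step described in the abstract can be illustrated with a minimal sketch. It is not the Needle implementation: the co-location rule (same sequence, start positions at most `max_offset` apart), the toy motif occurrences, and the naive greedy modularity merging are all assumptions made for illustration, and the function names are hypothetical.

```python
from itertools import combinations


def colocation_graph(occurrences, max_offset=2):
    """Build an undirected graph linking motifs that occur close together.

    occurrences maps motif -> set of (sequence_id, start_position).
    Assumed rule: two motifs are linked if they occur in the same sequence
    with start positions at most max_offset apart.
    """
    adj = {motif: set() for motif in occurrences}
    for a, b in combinations(occurrences, 2):
        if any(sa == sb and abs(pa - pb) <= max_offset
               for sa, pa in occurrences[a]
               for sb, pb in occurrences[b]):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def modularity(adj, communities):
    """Modularity Q of a partition of an unweighted, undirected graph."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    if m == 0:
        return 0.0
    q = 0.0
    for comm in communities:
        internal = sum(1 for u in comm for v in adj[u] if v in comm) / 2
        degree = sum(len(adj[u]) for u in comm)
        q += internal / m - (degree / (2 * m)) ** 2
    return q


def greedy_modules(adj):
    """Naive greedy merging: join the pair of communities that raises Q most,
    and stop when no merge improves Q. Chosen for brevity, not efficiency."""
    communities = [frozenset([node]) for node in adj]
    best_q = modularity(adj, communities)
    while True:
        best_merge = None
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            q = modularity(adj, merged)
            if q > best_q + 1e-12:
                best_q, best_merge = q, merged
        if best_merge is None:
            return communities, best_q
        communities = best_merge


# Toy input (invented): two groups of overlapping motifs in three sequences.
occurrences = {
    "ACGT": {("s1", 10), ("s2", 5)},
    "CGTA": {("s1", 11), ("s2", 6)},
    "GTAC": {("s1", 12), ("s2", 7)},
    "TTGA": {("s1", 40), ("s3", 3)},
    "TGAC": {("s1", 41), ("s3", 4)},
    "GACC": {("s1", 42), ("s3", 5)},
}
graph = colocation_graph(occurrences)
modules, q = greedy_modules(graph)
for module in modules:
    print(sorted(module))
```

In this toy input, the motifs that overlap at neighbouring positions form densely connected groups, and each recovered module would then be a candidate for a single underlying binding site. A practical implementation would need a faster community detection algorithm and a co-location rule matched to the output of the combinatorial motif finder.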
Pages: 19-26
Number of pages: 8
Related papers
31 in total
[1]  
Bailey T L, 1994, Proc Int Conf Intell Syst Mol Biol, V2, P28
[2]   Finding motifs using random projections [J].
Buhler, J ;
Tompa, M .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2002, 9 (02) :225-242
[3]   An efficient algorithm for the identification of structured motifs in DNA promoter sequences [J].
Carvalho, AM ;
Freitas, AT ;
Oliveira, AL ;
Sagot, MF .
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2006, 3 (02) :126-140
[4]  
Clauset A, 2004, PHYS REV E, V70, DOI 10.1103/PhysRevE.70.066111
[5]   Discrimination between paralogs using microarray analysis: Application to the Yap1p and Yap2p transcriptional networks [J].
Cohen, BA ;
Pilpel, Y ;
Mitra, RD ;
Church, GM .
MOLECULAR BIOLOGY OF THE CELL, 2002, 13 (05) :1608-1614
[6]   Direct activation of genes involved in intracellular iron use by the yeast iron-responsive transcription factor Aft2 without its paralog Aft1 [J].
Courel, M ;
Lallet, S ;
Camadro, JM ;
Blaiseau, PL .
MOLECULAR AND CELLULAR BIOLOGY, 2005, 25 (15) :6760-6771
[7]   WebLogo: A sequence logo generator [J].
Crooks, GE ;
Hon, G ;
Chandonia, JM ;
Brenner, SE .
GENOME RESEARCH, 2004, 14 (06) :1188-1190
[8]   A survey of DNA motif finding algorithms [J].
Das, Modan K. ;
Dai, Ho-Kwok .
BMC BIOINFORMATICS, 2007, 8 (Suppl 7)
[9]   Genome microarray analysis of transcriptional activation in multidrug resistance yeast mutants [J].
DeRisi, J ;
van den Hazel, B ;
Marc, P ;
Balzi, E ;
Brown, P ;
Jacq, C ;
Goffeau, A .
FEBS LETTERS, 2000, 470 (02) :156-160
[10]   Community structure in social and biological networks [J].
Girvan, M ;
Newman, MEJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (12) :7821-7826