A new framework for identifying cis-regulatory motifs in prokaryotes

Cited by: 29
Authors
Li, Guojun [1 ,2 ,3 ]
Liu, Bingqiang [1 ,2 ,3 ]
Ma, Qin [1 ,2 ,3 ]
Xu, Ying [1 ,2 ,4 ]
Affiliations
[1] Univ Georgia, Dept Biochem & Mol Biol, Computat Syst Biol Lab, Athens, GA 30602 USA
[2] Univ Georgia, Inst Bioinformat, Athens, GA 30602 USA
[3] Shandong Univ, Sch Math, Jinan 250100, Peoples R China
[4] Jilin Univ, Coll Comp Sci & Technol, Changchun 130023, Jilin, Peoples R China
Funding
National Science Foundation (USA)
Keywords
FACTOR-BINDING SITES; GAMMA-PROTEOBACTERIAL GENOMES; ESCHERICHIA-COLI; TRACTOR-DB; DNA; TRANSCRIPTION; DISCOVERY; SEQUENCES; DATABASE; PROTEIN;
DOI
10.1093/nar/gkq948
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We present a new algorithm, BOBRO, for prediction of cis-regulatory motifs in a given set of promoter sequences. The algorithm substantially improves the prediction accuracy and extends the scope of applicability of existing programs, based on two key new ideas: (i) we developed a highly effective method for reliably assessing the likelihood that each position in a given promoter is the (approximate) start of a conserved sequence motif; and (ii) we developed a highly reliable way of distinguishing actual motifs from accidental ones, based on the concept of 'motif closure'. These two ideas are embedded in a classical framework for motif finding through finding cliques in a graph, and they make this framework substantially more sensitive as well as more selective when finding motifs in a very noisy background. A comparative analysis shows that our program improved the performance coefficient from 29% to 41% compared to the best of six other state-of-the-art prediction tools on a large-scale data set of promoters from one genome, and also consistently improved it by substantial margins on another kind of large-scale data set, consisting of orthologous promoters across multiple genomes. The power of BOBRO in dealing with noisy data was further demonstrated by identifying the motifs of the global transcriptional regulators when running it over 2390 promoter sequences of Escherichia coli K12.
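The "classical framework for motif finding through finding cliques in a graph" mentioned in the abstract can be illustrated with a minimal sketch. This is not BOBRO itself: it omits the position-assessment step and the 'motif closure' test, and all names here (`hamming`, `build_graph`, `find_clique`, the parameters `k` and `d`) are hypothetical choices made for illustration. The sketch assumes a fixed motif length `k` and treats two k-mers from different sequences as connected when they differ in at most `d` positions; a motif candidate is then a clique containing one k-mer per sequence.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def build_graph(seqs, k, d):
    """Build the k-mer similarity graph.

    Nodes are (sequence index, start position) pairs. An edge joins two
    k-mers from different sequences that differ in at most d positions.
    """
    nodes = [(i, p) for i, s in enumerate(seqs) for p in range(len(s) - k + 1)]
    adj = {n: set() for n in nodes}
    for u, v in combinations(nodes, 2):
        if u[0] == v[0]:
            continue  # never connect k-mers from the same sequence
        a = seqs[u[0]][u[1]:u[1] + k]
        b = seqs[v[0]][v[1]:v[1] + k]
        if hamming(a, b) <= d:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def find_clique(adj, n_seqs):
    """Backtracking search for a clique with exactly one node per sequence.

    Returns the first such clique in position order, or None if none exists.
    """
    def extend(clique, candidates, i):
        if i == n_seqs:
            return clique
        for node in sorted(c for c in candidates if c[0] == i):
            # Keep only candidates adjacent to every node chosen so far.
            found = extend(clique + [node], candidates & adj[node], i + 1)
            if found is not None:
                return found
        return None

    return extend([], set(adj), 0)


if __name__ == "__main__":
    # Toy promoters with an implanted 6-mer (one copy carries a mismatch).
    promoters = ["GGGGACGTACGGGG", "CCACGTTCCCCC", "TTTTTACGTACTT"]
    graph = build_graph(promoters, k=6, d=1)
    print(find_clique(graph, len(promoters)))
```

The exhaustive pairwise comparison and backtracking search are only practical for toy inputs. On real promoter sets, most edges in such a graph arise by chance, which is the noise problem the two ideas in the abstract are meant to address.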
Pages: E42 / U54
Page count: 9