Novel sequence-based method for identifying transcription factor binding sites in prokaryotic genomes

Cited by: 21
Authors
Sahota, Gurmukh [1]
Stormo, Gary D. [1]
Affiliations
[1] Washington Univ, Sch Med, Dept Genet, St Louis, MO 63108 USA
Keywords
PROTEINS;
DOI
10.1093/bioinformatics/btq501
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Motivation: Computational techniques for microbial genomic sequence analysis are becoming increasingly important. With next-generation sequencing technology and the Human Microbiome Project underway, current sequencing capacity is significantly greater than the speed at which organisms of interest can be studied experimentally. Most related computational work has focused on sequence assembly, gene annotation and metabolic network reconstruction. We have developed a method that relies primarily on available sequence data to determine prokaryotic transcription factor (TF) binding specificities.
Results: Specificity-determining residues (critical residues) were identified from crystal structures of DNA-protein complexes, and TFs with the same critical residues were grouped into specificity classes. The putative binding regions for each class were defined as the set of promoters for each TF itself (autoregulatory) and for the immediately upstream and downstream operons. MEME was used to find putative motifs within each separate class. Tests on the LacI and TetR TF families, using RegulonDB annotated sites, showed prediction sensitivities of 86% and 80%, respectively.
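The grouping step and the sensitivity measure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy aligned sequences and the critical column indices are all hypothetical. It assumes the TF sequences are already aligned so that critical residues sit in the same columns, and the motif-finding step (MEME) is left out entirely.

```python
from collections import defaultdict


def group_by_critical_residues(aligned_tfs, critical_columns):
    """Group TFs that share identical residues at the critical alignment columns.

    aligned_tfs: dict mapping a TF name to its aligned protein sequence (hypothetical input).
    critical_columns: 0-based alignment columns assumed to determine specificity.
    """
    classes = defaultdict(list)
    for name, seq in aligned_tfs.items():
        # The class key is the string of residues found at the critical columns.
        key = "".join(seq[i] for i in critical_columns)
        classes[key].append(name)
    return dict(classes)


def sensitivity(predicted_sites, annotated_sites):
    """Fraction of annotated sites that were recovered: TP / (TP + FN)."""
    annotated = set(annotated_sites)
    if not annotated:
        raise ValueError("at least one annotated site is required")
    return len(annotated & set(predicted_sites)) / len(annotated)


# Toy data, invented for illustration only.
toy_tfs = {"tfA": "MKRVT", "tfB": "MKQVT", "tfC": "AKRLT"}
toy_classes = group_by_critical_residues(toy_tfs, critical_columns=[1, 2, 4])
toy_sensitivity = sensitivity({"s1", "s2", "s4"}, {"s1", "s2", "s3"})
```

In this toy input, tfA and tfC fall into one class because they agree at all three assumed critical columns, while tfB forms a class of its own. In a workflow like the one described, the promoter regions associated with each class would then be passed to a motif finder.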
Pages: 2672-2677
Page count: 6
Related papers
20 in total
[1]   Regulog analysis: Detection of conserved regulatory networks across bacteria: Application to Staphylococcus aureus [J].
Alkema, WBL ;
Lenhard, B ;
Wasserman, WW .
GENOME RESEARCH, 2004, 14 (07) :1362-1373
[2]  
Bailey TL., 1994, Proc Int Conf Intel Syst Mol Biol, V2, P28
[3]   Comparative footprinting of DNA-binding proteins [J].
Contreras-Moreira, Bruno ;
Collado-Vides, Julio .
BIOINFORMATICS, 2006, 22 (14) :E74-E80
[4]   PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations [J].
Dolinsky, Todd J. ;
Czodrowski, Paul ;
Li, Hui ;
Nielsen, Jens E. ;
Jensen, Jan H. ;
Klebe, Gerhard ;
Baker, Nathan A. .
NUCLEIC ACIDS RESEARCH, 2007, 35 :W522-W525
[5]   Data Deposition and Annotation at the Worldwide Protein Data Bank [J].
Dutta, Shuchismita ;
Burkhardt, Kyle ;
Young, Jasmine ;
Swaminathan, Ganesh J. ;
Matsuura, Takanori ;
Henrick, Kim ;
Nakamura, Haruki ;
Berman, Helen M. .
MOLECULAR BIOTECHNOLOGY, 2009, 42 (01) :1-13
[6]  
Eddy Sean R, 2009, Genome Inform, V23, P205
[7]   The Pfam protein families database [J].
Finn, Robert D. ;
Mistry, Jaina ;
Tate, John ;
Coggill, Penny ;
Heger, Andreas ;
Pollington, Joanne E. ;
Gavin, O. Luke ;
Gunasekaran, Prasad ;
Ceric, Goran ;
Forslund, Kristoffer ;
Holm, Liisa ;
Sonnhammer, Erik L. L. ;
Eddy, Sean R. ;
Bateman, Alex .
NUCLEIC ACIDS RESEARCH, 2010, 38 :D211-D222
[8]   RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation [J].
Gama-Castro, Socorro ;
Jimenez-Jacinto, Veronica ;
Peralta-Gil, Martin ;
Santos-Zavaleta, Alberto ;
Penaloza-Spinola, Monica I. ;
Contreras-Moreira, Bruno ;
Segura-Salazar, Juan ;
Muniz-Rascado, Luis ;
Martinez-Flores, Irma ;
Salgado, Heladia ;
Bonavides-Martinez, Cesar ;
Abreu-Goodger, Cei ;
Rodriguez-Penagos, Carlos ;
Miranda-Rios, Juan ;
Morett, Enrique ;
Merino, Enrique ;
Huerta, Araceli M. ;
Trevino-Quintanilla, Luis ;
Collado-Vides, Julio .
NUCLEIC ACIDS RESEARCH, 2008, 36 :D120-D124
[9]  
Gelfand M S, 2000, Brief Bioinform, V1, P357, DOI 10.1093/bib/1.4.357
[10]   Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data [J].
Hamady, Micah ;
Lozupone, Catherine ;
Knight, Rob .
ISME JOURNAL, 2010, 4 (01) :17-27