MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data

Cited by: 5
Authors
Ozaki, Haruka [1, 4]
Iwasaki, Wataru [1, 2, 3]
Affiliations
[1] Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol, Kashiwanoha 5-1-5, Kashiwa, Chiba 2778568, Japan
[2] Univ Tokyo, Grad Sch Sci, Dept Biol Sci, Bunkyo Ku, Hongo 7-3-1, Tokyo 1130032, Japan
[3] Univ Tokyo, Atmosphere & Ocean Res Inst, Kashiwanoha 5-1-5, Kashiwa, Chiba 2778564, Japan
[4] RIKEN, Adv Ctr Comp & Commun, Bioinformat Res Unit, 2-1 Hirosawa, Wako, Saitama 3510198, Japan
Keywords
DNA binding motifs; ChIP-Seq; Transcription factors; SERUM RESPONSE FACTOR; TRANSCRIPTION-FACTOR; SEQUENCE; SITES; GENE; CREB; EXPRESSION; DISCOVERY; TRANSACTIVATION; ELEMENTS
DOI
10.1016/j.compbiolchem.2016.01.014
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-Seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif.
Results: Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs.
Conclusions: By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions.
© 2016 Elsevier Ltd. All rights reserved.
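Illustrative sketch (not taken from the paper or from the MOCCS code base): the abstract says that MOCCS examines every k-mer in ChIP-Seq data, but it does not describe the scoring procedure. The Python snippet below shows one generic way to ask, per k-mer, whether its occurrences concentrate near peak centers. The function names `kmer_positions` and `central_fraction`, the `window` parameter, and the assumption that each input sequence is centered on a peak summit are all hypothetical choices made for this example. The score is a simple stand-in, not the MOCCS statistic.

```python
from collections import Counter


def kmer_positions(seq, k):
    """Yield (kmer, distance) for every k-mer in seq.

    distance is measured from the k-mer midpoint to the sequence midpoint,
    which is assumed (for this sketch) to be the ChIP-Seq peak center.
    """
    center = len(seq) / 2
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip k-mers containing ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            yield kmer, abs(i + k / 2 - center)


def central_fraction(seqs, k, window):
    """Return, for each k-mer, the fraction of its occurrences that lie
    within `window` bases of the assumed peak center."""
    total = Counter()
    central = Counter()
    for seq in seqs:
        for kmer, dist in kmer_positions(seq.upper(), k):
            total[kmer] += 1
            if dist <= window:
                central[kmer] += 1
    return {kmer: central[kmer] / total[kmer] for kmer in total}


if __name__ == "__main__":
    # Toy peak-centered sequences, invented for illustration only.
    peaks = ["AAAACGTGAAAA", "TTTACGTGTTTT"]
    scores = central_fraction(peaks, k=4, window=1)
    for kmer, frac in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(kmer, round(frac, 2))
```

A k-mer that the TF binds would be expected to score higher than unrelated k-mers under this toy measure, since binding sites tend to lie near peak centers. For real analyses, use the released MOCCS software at the URL given in the abstract.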
Pages: 62-72
Number of pages: 11