MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data

Cited by: 5
Authors
Ozaki, Haruka [1, 4]
Iwasaki, Wataru [1, 2, 3]
Affiliations
[1] Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol, Kashiwanoha 5-1-5, Kashiwa, Chiba 2778568, Japan
[2] Univ Tokyo, Grad Sch Sci, Dept Biol Sci, Bunkyo Ku, Hongo 7-3-1, Tokyo 1130032, Japan
[3] Univ Tokyo, Atmosphere & Ocean Res Inst, Kashiwanoha 5-1-5, Kashiwa, Chiba 2778564, Japan
[4] RIKEN, Adv Ctr Comp & Commun, Bioinformat Res Unit, 2-1 Hirosawa, Wako, Saitama 3510198, Japan
Keywords
DNA binding motifs; ChIP-Seq; Transcription factors; SERUM RESPONSE FACTOR; TRANSCRIPTION-FACTOR; SEQUENCE; SITES; GENE; CREB; EXPRESSION; DISCOVERY; TRANSACTIVATION; ELEMENTS
DOI
10.1016/j.compbiolchem.2016.01.014
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-Seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif.

Results: Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs.

Conclusions: By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions.

© 2016 Elsevier Ltd. All rights reserved.
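To make the idea of "analyzing every k-mer" in ChIP-Seq-derived sequences concrete, the following is a minimal sketch in Python. It is not the MOCCS implementation (which the abstract states is written in Perl and R) and does not reproduce its scoring. It assumes, hypothetically, that each input sequence is a fixed-width window centered on a ChIP-Seq peak position; the function names `kmer_offsets` and `mean_distance` are illustrative only. For each k-mer it records how far its occurrences lie from the window center, so that k-mers concentrated near the center can be distinguished from those spread uniformly.

```python
from collections import defaultdict


def kmer_offsets(sequences, k):
    """Collect, for every k-mer, the distances of its occurrences from the window center.

    Assumption: each sequence is a window centered on a peak position.
    K-mers containing characters other than A, C, G, T (e.g. N) are skipped.
    """
    offsets = defaultdict(list)
    for seq in sequences:
        seq = seq.upper()
        center = len(seq) / 2
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                # Distance between the k-mer midpoint and the window center.
                offsets[kmer].append(abs(i + k / 2 - center))
    return offsets


def mean_distance(offsets):
    """Average distance from the center per k-mer (smaller means more central)."""
    return {kmer: sum(d) / len(d) for kmer, d in offsets.items()}


if __name__ == "__main__":
    windows = ["AACGTAA", "TTCGTGG"]  # toy windows, not real ChIP-Seq data
    summary = mean_distance(kmer_offsets(windows, 3))
    for kmer, dist in sorted(summary.items(), key=lambda kv: kv[1]):
        print(kmer, dist)
```

In this toy input, a k-mer shared at the middle of both windows ends up with the smallest mean distance. A real analysis would need many peaks and a proper statistic rather than a simple mean.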
Pages: 62-72
Number of pages: 11