MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data

Cited by: 5
Authors
Ozaki, Haruka [1, 4]
Iwasaki, Wataru [1, 2, 3]
Affiliations
[1] Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol, Kashiwanoha 5-1-5, Kashiwa, Chiba 2778568, Japan
[2] Univ Tokyo, Grad Sch Sci, Dept Biol Sci, Bunkyo Ku, Hongo 7-3-1, Tokyo 1130032, Japan
[3] Univ Tokyo, Atmosphere & Ocean Res Inst, Kashiwanoha 5-1-5, Kashiwa, Chiba 2778564, Japan
[4] RIKEN, Adv Ctr Comp & Commun, Bioinformat Res Unit, 2-1 Hirosawa, Wako, Saitama 3510198, Japan
Keywords
DNA binding motifs; ChIP-Seq; Transcription factors; SERUM RESPONSE FACTOR; TRANSCRIPTION-FACTOR; SEQUENCE; SITES; GENE; CREB; EXPRESSION; DISCOVERY; TRANSACTIVATION; ELEMENTS
DOI
10.1016/j.compbiolchem.2016.01.014
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-Seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif.
Results: Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs.
Conclusions: By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions.
(C) 2016 Elsevier Ltd. All rights reserved.
Pages: 62-72
Page count: 11