FisherMP: fully parallel algorithm for detecting combinatorial motifs from large ChIP-seq datasets

Cited by: 13
Authors
Zhang, Shaoqiang [1]
Liang, Ying [1]
Wang, Xiangyun [1]
Su, Zhengchang [1,2]
Chen, Yong [3]
Affiliations
[1] Tianjin Normal Univ, Coll Comp & Informat Engn, Tianjin 300387, Peoples R China
[2] Univ N Carolina, Dept Bioinformat & Genom, Charlotte, NC 28223 USA
[3] Univ Texas Dallas, Ctr Syst Biol, Dept Biol Sci, Richardson, TX 75080 USA
Funding
US National Science Foundation; National Natural Science Foundation of China
Keywords
combinatorial motifs; parallel algorithm; ChIP-seq; ESTROGEN-RECEPTOR-ALPHA; TRANSCRIPTION FACTOR; FUNCTIONAL ELEMENTS; BINDING SITES; DISCOVERY; IDENTIFICATION; GENOME; EXPRESSION; REGIONS; GROWTH;
DOI
10.1093/dnares/dsz004
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Detecting the binding motifs of combinatorial transcription factors (TFs) from chromatin immunoprecipitation sequencing (ChIP-seq) experiments is an important and challenging computational problem for understanding gene regulation. Although a number of motif-finding algorithms have been presented, most are either time-consuming or have sub-optimal accuracy when processing large-scale datasets. In this article, we present a fully parallelized algorithm for detecting combinatorial motifs from ChIP-seq datasets, using Fisher's combined method and an OpenMP parallel design. Large-scale validation on both synthetic data and 350 ChIP-seq datasets from the ENCODE database showed that FisherMP is not only fast on large datasets but also highly accurate compared with multiple popular methods. Using FisherMP, we successfully detected combinatorial motifs of CTCF, YY1, MAZ, STAT3 and USF2 in chromosome X, suggesting that they are functional co-players in gene regulation and chromosomal organization. Integrative and statistical analyses of these TF-binding peaks clearly demonstrate that they are not only highly coordinated with each other but also correlated with histone modifications. FisherMP can be applied to the integrative analysis of binding motifs and to predicting cis-regulatory modules from a large number of ChIP-seq datasets.
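For readers unfamiliar with the Fisher combined method named in the abstract, the following is a minimal sketch of the textbook procedure for combining independent p-values. It is not taken from FisherMP: the abstract does not say how the method is applied to motif scores, so the function name `fisher_combined_pvalue` and its input format are assumptions made for illustration, and the OpenMP parallelization is not shown.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method (textbook form).

    Hypothetical helper for illustration; not part of FisherMP.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(pvalues)
    # Test statistic X = -2 * sum(ln p_i); under the null hypothesis it
    # follows a chi-square distribution with 2k degrees of freedom.
    half_x = -sum(math.log(p) for p in pvalues)  # this is X / 2

    # For an even number of degrees of freedom (2k) the chi-square
    # survival function has a closed form:
    #   P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


if __name__ == "__main__":
    # Two individually borderline p-values give a stronger combined result.
    print(fisher_combined_pvalue([0.05, 0.05]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the closed-form tail sum.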
Pages: 231-242
Page count: 12
Related papers
53 in total
[1]  
[Anonymous], 1948, Am. Stat.
[2]   MEME SUITE: tools for motif discovery and searching [J].
Bailey, Timothy L. ;
Boden, Mikael ;
Buske, Fabian A. ;
Frith, Martin ;
Grant, Charles E. ;
Clementi, Luca ;
Ren, Jingyuan ;
Li, Wilfred W. ;
Noble, William S. .
NUCLEIC ACIDS RESEARCH, 2009, 37 :W202-W208
[3]   DREME: motif discovery in transcription factor ChIP-seq data [J].
Bailey, Timothy L. .
BIOINFORMATICS, 2011, 27 (12) :1653-1659
[4]   Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project [J].
Birney, Ewan ;
Stamatoyannopoulos, John A. ;
Dutta, Anindya ;
Guigo, Roderic ;
Gingeras, Thomas R. ;
Margulies, Elliott H. ;
Weng, Zhiping ;
Snyder, Michael ;
Dermitzakis, Emmanouil T. ;
Stamatoyannopoulos, John A. ;
Thurman, Robert E. ;
Kuehn, Michael S. ;
Taylor, Christopher M. ;
Neph, Shane ;
Koch, Christoph M. ;
Asthana, Saurabh ;
Malhotra, Ankit ;
Adzhubei, Ivan ;
Greenbaum, Jason A. ;
Andrews, Robert M. ;
Flicek, Paul ;
Boyle, Patrick J. ;
Cao, Hua ;
Carter, Nigel P. ;
Clelland, Gayle K. ;
Davis, Sean ;
Day, Nathan ;
Dhami, Pawandeep ;
Dillon, Shane C. ;
Dorschner, Michael O. ;
Fiegler, Heike ;
Giresi, Paul G. ;
Goldy, Jeff ;
Hawrylycz, Michael ;
Haydock, Andrew ;
Humbert, Richard ;
James, Keith D. ;
Johnson, Brett E. ;
Johnson, Ericka M. ;
Frum, Tristan T. ;
Rosenzweig, Elizabeth R. ;
Karnani, Neerja ;
Lee, Kirsten ;
Lefebvre, Gregory C. ;
Navas, Patrick A. ;
Neri, Fidencio ;
Parker, Stephen C. J. ;
Sabo, Peter J. ;
Sandstrom, Richard ;
Shafer, Anthony .
NATURE, 2007, 447 (7146) :799-816
[5]   METHOD FOR COMBINING NON-INDEPENDENT, ONE-SIDED TESTS OF SIGNIFICANCE [J].
BROWN, MB .
BIOMETRICS, 1975, 31 (04) :987-992
[6]   ChIP-chip: considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments [J].
Buck, MJ ;
Lieb, JD .
GENOMICS, 2004, 83 (03) :349-360
[7]   FastMotif: spectral sequence motif discovery [J].
Colombo, Nicolo ;
Vlassis, Nikos .
BIOINFORMATICS, 2015, 31 (16) :2623-2631
[8]   What are DNA sequence motifs? [J].
D'haeseleer, P .
NATURE BIOTECHNOLOGY, 2006, 24 (04) :423-425
[9]   Regulation of the estrogen receptor α minimal promoter by Sp1, USF-1 and ERα [J].
deGraffenried, LA ;
Hopp, TA ;
Valente, AJ ;
Clark, RA ;
Fuqua, SAW .
BREAST CANCER RESEARCH AND TREATMENT, 2004, 85 (02) :111-120
[10]   Identification of a Ctcf cofactor, Yy1, for the X chromosome binary switch [J].
Donohoe, Mary E. ;
Zhang, Li-Feng ;
Xu, Na ;
Shi, Yang ;
Lee, Jeannie T. .
MOLECULAR CELL, 2007, 25 (01) :43-56