SIOMICS: a novel approach for systematic identification of motifs in ChIP-seq data

Cited by: 21
Authors
Ding, Jun [1]
Hu, Haiyan [1]
Li, Xiaoman [2]
Affiliations
[1] Univ Cent Florida, Dept Elect Engn & Comp Sci, Orlando, FL 32816 USA
[2] Univ Cent Florida, Burnett Sch Biomed Sci, Orlando, FL 32816 USA
Funding
US National Science Foundation;
Keywords
DNA-BINDING-SITES; CIS-REGULATORY MODULES; HUMAN GENOME; CHROMATIN-IMMUNOPRECIPITATION; GENE-EXPRESSION; PREDICTION; DISCOVERY; PROFILES; ELEMENTS; NETWORK;
DOI
10.1093/nar/gkt1288
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
The identification of transcription factor binding motifs is important for the study of gene transcriptional regulation. Chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-seq) provides an unprecedented opportunity to discover binding motifs. Computational methods have been developed to identify motifs from ChIP-seq data, but they encounter several problems. For example, existing methods often do not scale to the large number of sequences obtained from ChIP-seq peak regions. Some methods rely heavily on well-annotated motifs even though the number of known motifs is limited. To simplify the problem, de novo motif discovery methods often neglect underrepresented motifs in ChIP-seq peak regions. To address these issues, we developed a novel approach called SIOMICS for de novo motif discovery from ChIP-seq data. Tested on 13 ChIP-seq data sets, SIOMICS identified motifs of many known and new cofactors. Tested on 13 simulated random data sets, SIOMICS discovered no motif in any data set. Compared with two recently developed methods for motif discovery, SIOMICS shows advantages in speed, in the number of known cofactor motifs predicted in experimental data sets and in the number of false motifs predicted in random data sets. The SIOMICS software is freely available at http://eecs.ucf.edu/~xiaoman/SIOMICS/SIOMICS.html.
Pages: 9