PePr: a peak-calling prioritization pipeline to identify consistent or differential peaks from replicated ChIP-Seq data

Cited by: 85
Authors
Zhang, Yanxiao [1]
Lin, Yu-Hsuan [1]
Johnson, Timothy D. [2]
Rozek, Laura S. [3]
Sartor, Maureen A. [1,2]
Affiliations
[1] Univ Michigan, Sch Publ Hlth, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Sch Publ Hlth, Dept Biostat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Sch Publ Hlth, Dept Environm Hlth Sci, Ann Arbor, MI 48109 USA
Funding
National Institutes of Health (USA)
Keywords
FACTOR-BINDING SITES; HISTONE MODIFICATIONS; TRANSCRIPTION FACTORS; EXPRESSION ANALYSIS; CELL LINES; ALGORITHM; IDENTIFICATION; ENRICHMENT; DOMAINS; REGIONS;
DOI
10.1093/bioinformatics/btu372
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: ChIP-Seq is the standard method to identify genome-wide DNA-binding sites for transcription factors (TFs) and histone modifications. There is a growing need to analyze experiments with biological replicates, especially for epigenomic experiments where variation among biological samples can be substantial. However, tools that can perform group comparisons are currently lacking.
Results: We present a peak-calling prioritization pipeline (PePr) for identifying consistent or differential binding sites in ChIP-Seq experiments with biological replicates. PePr models read counts across the genome among biological samples with a negative binomial distribution and uses a local variance estimation method, ranking consistent or differential binding sites more favorably than sites with greater variability. We compared PePr with commonly used and recently proposed approaches on eight TF datasets and show that PePr uniquely identifies consistent regions with enriched read counts, high motif occurrence rate and known characteristics of TF binding based on visual inspection. For histone modification data with broadly enriched regions, PePr identified differential regions that are consistent within groups and outperformed other methods in scaling False Discovery Rate (FDR) analysis.
Pages: 2568-2575
Page count: 8