Reproducible inference of transcription factor footprints in ATAC-seq and DNase-seq datasets using protocol-specific bias modeling

Cited by: 55
Authors
Calviello, Aslihan Karabacak [1, 2]
Hirsekorn, Antje [1 ]
Wurmus, Ricardo [1 ]
Yusuf, Dilmurat [1 ]
Ohler, Uwe [1, 2, 3]
Affiliations
[1] Berlin Inst Med Syst Biol, Max Delbruck Ctr Mol Med, Berlin, Germany
[2] Humboldt Univ, Dept Biol, Berlin, Germany
[3] Humboldt Univ, Dept Comp Sci, Berlin, Germany
Keywords
ATAC-seq; DNase-seq; Footprinting; Bias correction; Reproducibility; FACTOR-BINDING SITES; READ ALIGNMENT; OPEN CHROMATIN; HYPERSENSITIVITY; SCALE
DOI
10.1186/s13059-019-1654-y
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single-nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions through footprinting. Recent studies have demonstrated the sequence bias of DNase I and its adverse effects on footprinting efficiency. However, footprinting and the impact of sequence bias have not been extensively studied for ATAC-seq.
Results: Here, we undertake a systematic comparison of the two methods and show that a modification to the ATAC-seq protocol increases its yield and its agreement with DNase-seq data from the same cell line. We demonstrate that the two methods have distinct sequence biases and correct for these protocol-specific biases when performing footprinting. Despite the differences in footprint shapes, the locations of the inferred footprints in ATAC-seq and DNase-seq are largely concordant. However, the protocol-specific sequence biases, in conjunction with the sequence content of TFBSs, impact the discrimination of footprints from the background, which leads to one method outperforming the other for some TFs. Finally, we address the depth required for reproducible identification of open chromatin regions and TF footprints.
Conclusions: We demonstrate that the impact of bias correction on footprinting performance is greater for DNase-seq than for ATAC-seq and that DNase-seq footprinting leads to better performance. It is possible to infer concordant footprints by using replicates, highlighting the importance of reproducibility assessment. The results presented here provide an overview of the advantages and limitations of footprinting analyses using ATAC-seq and DNase-seq.
Pages: 13