AIControl: replacing matched control experiments with machine learning improves ChIP-seq peak identification

Cited by: 7
Authors
Hiranuma, Naozumi [1]
Lundberg, Scott M. [1]
Lee, Su-In [1]
Affiliations
[1] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
INTEGRATIVE ANALYSIS; CHROMATIN-STATE; BINDING; NETWORK; DISCOVERY; ENCODE; SITES; CELLS;
DOI
10.1093/nar/gkz156
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
ChIP-seq is a technique for determining the binding locations of transcription factors, a task that remains a central challenge in molecular biology. Current practice is to use a 'control' dataset to remove background signals from an immunoprecipitation (IP) 'target' dataset. We introduce the AIControl framework, which eliminates the need to obtain a control dataset and instead identifies binding peaks by estimating the distributions of background signals from many publicly available control ChIP-seq datasets. We thereby avoid the cost of running control experiments while simultaneously increasing the accuracy of binding location identification. Specifically, AIControl can (i) estimate background signals at fine resolution, (ii) systematically weigh the most appropriate control datasets in a data-driven way, (iii) capture sources of potential bias that may be missed by a single control dataset and (iv) remove the need for costly and time-consuming control experiments. We applied AIControl to 410 IP datasets in the ENCODE ChIP-seq database, using 440 control datasets from 107 cell types to impute background signal. Without using matched control datasets, AIControl identified peaks that were more enriched for putative binding sites than those identified by other popular peak callers that used a matched control dataset. We also demonstrated that our framework identifies binding sites that recover documented protein interactions more accurately.
Pages: 16