HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis

Cited by: 523
Authors
Kulakovskiy, Ivan V. [1, 2, 3]
Vorontsov, Ilya E. [2]
Yevshin, Ivan S. [4]
Sharipov, Ruslan N. [4, 5, 6]
Fedorova, Alla D. [7]
Rumynskiy, Eugene I. [2, 8]
Medvedeva, Yulia A. [2, 8, 9]
Magana-Mora, Arturo [10, 11]
Bajic, Vladimir B. [11]
Papatsenko, Dmitry A. [3]
Kolpakov, Fedor A. [4, 5]
Makeev, Vsevolod J. [1, 2, 8]
Affiliations
[1] Russian Acad Sci, Engelhardt Inst Mol Biol, GSP-1,Vavilova 32, Moscow 119991, Russia
[2] Russian Acad Sci, Vavilov Inst Gen Genet, GSP-1,Gubkina 3, Moscow 119991, Russia
[3] Skolkovo Inst Sci & Technol, Ctr Data Intens Biomed & Biotechnol, Moscow 143026, Russia
[4] BIOSOFT RU Ltd, Russkaya 41-1, Novosibirsk 630058, Russia
[5] Russian Acad Sci, Siberian Branch, Inst Computat Technol, Akad Rzhanova 6, Novosibirsk 630090, Russia
[6] Novosibirsk State Univ, Pirogova 2, Novosibirsk 630090, Russia
[7] Lomonosov Moscow State Univ, Fac Bioengn & Bioinformat, Leninskiye Gory 1-73, Moscow 119234, Russia
[8] State Univ, Moscow Inst Phys & Technol, 9 Inst Skiy, Dolgoprudnyi 141700, Russia
[9] Russian Acad Sci, Res Ctr Biotechnol, Inst Bioengn, 2 Leninsky Ave 33, Moscow 119071, Russia
[10] Natl Inst Adv Ind Sci & Technol, CBBD OIL, AIST Tokyo Waterfront Main Bldg 323,2-3-26 Aomi, Tokyo 1350064, Japan
[11] KAUST, CBRC, Thuwal 239556900, Saudi Arabia
Funding
Russian Science Foundation
Keywords
OPEN-ACCESS DATABASE; SITES; GENE; EXPANSION; MOTIFS;
DOI
10.1093/nar/gkx1106
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.
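The abstract refers to mononucleotide position weight matrices applied to short DNA sequences. As a rough illustration of what scoring with such a matrix involves, the following is a minimal sketch only: the count matrix is invented for this example and is not taken from HOCOMOCO, the function names are hypothetical, a uniform background and a pseudocount of 1 are assumed, and only the forward strand is scanned. It does not reproduce the scoring or thresholds used by MoLoTool.

```python
import math

ALPHABET = "ACGT"

# Hypothetical count matrix: one row per motif position, columns A, C, G, T.
# The numbers are made up for illustration and are NOT a HOCOMOCO model.
COUNTS = [
    [8, 1, 0, 1],
    [0, 0, 10, 0],
    [1, 7, 1, 1],
    [0, 0, 0, 10],
]


def counts_to_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn counts into log-odds weights against a uniform background (assumed)."""
    pwm = []
    for row in counts:
        total = sum(row) + pseudocount * len(row)
        pwm.append(
            [math.log((c + pseudocount) / total / background) for c in row]
        )
    return pwm


def score_window(pwm, window):
    """Sum the per-position weights for a window as long as the matrix."""
    return sum(pwm[i][ALPHABET.index(base)] for i, base in enumerate(window))


def best_hit(pwm, sequence):
    """Return (score, start) of the best-scoring forward-strand window."""
    width = len(pwm)
    hits = [
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)


if __name__ == "__main__":
    pwm = counts_to_pwm(COUNTS)
    score, start = best_hit(pwm, "TTAGCTAA")
    print(f"best window starts at {start} with log-odds score {score:.2f}")
```

A dinucleotide matrix would be scored the same way in outline, with each row indexed by one of the 16 adjacent base pairs instead of a single base.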
Pages: D252-D259
Number of pages: 8