Integrative annotation of chromatin elements from ENCODE data

Cited by: 352
Authors
Hoffman, Michael M. [1 ]
Ernst, Jason [2 ,3 ]
Wilder, Steven P. [4 ]
Kundaje, Anshul [5 ]
Harris, Robert S. [6 ]
Libbrecht, Max [1 ,7 ]
Giardine, Belinda [6 ]
Ellenbogen, Paul M. [1 ,7 ]
Bilmes, Jeffrey A. [8 ]
Birney, Ewan [4 ]
Hardison, Ross C. [6 ]
Dunham, Ian [4 ]
Kellis, Manolis [2 ,3 ]
Noble, William Stafford [1 ,7 ]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[3] Broad Inst MIT & Harvard, Cambridge Ctr 7, Cambridge, MA 02142 USA
[4] EMBL European Bioinformat Inst, Cambridge CB10 1SD, England
[5] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[6] Penn State Univ, Ctr Comparat Genom & Bioinformat, University Pk, PA 16802 USA
[7] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
[8] Univ Washington, Dept Elect Engn, Seattle, WA 98195 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
GAMMA-GLOBIN GENE; HEREDITARY PERSISTENCE; FETAL-HEMOGLOBIN; TRANSCRIPTION; SUSCEPTIBILITY; DISCOVERY; ASSOCIATION; BREAKPOINT; EXPRESSION; VERTEBRATE;
DOI
10.1093/nar/gks1284
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The ENCODE Project has generated a wealth of experimental information mapping diverse chromatin properties in several human cell lines. Although each such data track is independently informative toward the annotation of regulatory elements, their interrelations contain much richer information for the systematic annotation of regulatory elements. To uncover these interrelations and to generate an interpretable summary of the massive datasets of the ENCODE Project, we apply unsupervised learning methodologies, converting dozens of chromatin datasets into discrete annotation maps of regulatory regions and other chromatin elements across the human genome. These methods rediscover and summarize diverse aspects of chromatin architecture, elucidate the interplay between chromatin activity and RNA transcription, and reveal that a large proportion of the genome lies in a quiescent state, even across multiple cell types. The resulting annotations of non-coding regulatory elements correlate strongly with mammalian evolutionary constraint and provide an unbiased approach for evaluating metrics of evolutionary constraint in human. Lastly, we use the regulatory annotations to revisit previously uncharacterized disease-associated loci, resulting in focused, testable hypotheses through the lens of the chromatin landscape.
Pages: 827-841
Page count: 15
Related papers
57 in total
  • [1] Toward a gold standard for promoter prediction evaluation
    Abeel, Thomas
    Van de Peer, Yves
    Saeys, Yvan
    [J]. BIOINFORMATICS, 2009, 25 (12) : I313 - I320
  • [2] A map of human genome variation from population-scale sequencing
    Altshuler, David
    Durbin, Richard M.
    Abecasis, Goncalo R.
    Bentley, David R.
    Chakravarti, Aravinda
    Clark, Andrew G.
    Collins, Francis S.
    De la Vega, Francisco M.
    Donnelly, Peter
    Egholm, Michael
    Flicek, Paul
    Gabriel, Stacey B.
    Gibbs, Richard A.
    Knoppers, Bartha M.
    Lander, Eric S.
    Lehrach, Hans
    Mardis, Elaine R.
    McVean, Gil A.
    Nickerson, Debbie A.
    Peltonen, Leena
    Schafer, Alan J.
    Sherry, Stephen T.
    Wang, Jun
    Wilson, Richard K.
    Gibbs, Richard A.
    Deiros, David
    Metzker, Mike
    Muzny, Donna
    Reid, Jeff
    Wheeler, David
    Wang, Jun
    Li, Jingxiang
    Jian, Min
    Li, Guoqing
    Li, Ruiqiang
    Liang, Huiqing
    Tian, Geng
    Wang, Bo
    Wang, Jian
    Wang, Wei
    Yang, Huanming
    Zhang, Xiuqing
    Zheng, Huisong
    Lander, Eric S.
    Altshuler, David L.
    Ambrogio, Lauren
    Bloom, Toby
    Cibulskis, Kristian
    Fennell, Tim J.
    Gabriel, Stacey B.
    [J]. NATURE, 2010, 467 (7319) : 1061 - 1073
  • [3] SEQUENCES LOCATED 3' TO THE BREAKPOINT OF THE HEREDITARY PERSISTENCE OF FETAL HEMOGLOBIN-3 DELETION EXHIBIT ENHANCER ACTIVITY AND CAN MODIFY THE DEVELOPMENTAL EXPRESSION OF THE HUMAN FETAL A-GAMMA-GLOBIN GENE IN TRANSGENIC MICE
    ANAGNOU, NP
    PEREZSTABLE, C
    GELINAS, R
    COSTANTINI, F
    LIAPAKI, K
    CONSTANTOPOULOU, M
    KOSTEAS, T
    MOSCHONAS, NK
    STAMATOYANNOPOULOS, G
    [J]. JOURNAL OF BIOLOGICAL CHEMISTRY, 1995, 270 (17) : 10256 - 10263
  • [4] The protein CTCF is required for the enhancer blocking activity of vertebrate insulators
    Bell, AC
    West, AG
    Felsenfeld, G
    [J]. CELL, 1999, 98 (03) : 387 - 396
  • [5] A bivalent chromatin structure marks key developmental genes in embryonic stem cells
    Bernstein, BE
    Mikkelsen, TS
    Xie, XH
    Kamal, M
    Huebert, DJ
    Cuff, J
    Fry, B
    Meissner, A
    Wernig, M
    Plath, K
    Jaenisch, R
    Wagschal, A
    Feil, R
    Schreiber, SL
    Lander, ES
    [J]. CELL, 2006, 125 (02) : 315 - 326
  • [6] GeneWise and genomewise
    Birney, E
    Clamp, M
    Durbin, R
    [J]. GENOME RESEARCH, 2004, 14 (05) : 988 - 995
  • [7] Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project
    Birney, Ewan
    Stamatoyannopoulos, John A.
    Dutta, Anindya
    Guigo, Roderic
    Gingeras, Thomas R.
    Margulies, Elliott H.
    Weng, Zhiping
    Snyder, Michael
    Dermitzakis, Emmanouil T.
    Stamatoyannopoulos, John A.
    Thurman, Robert E.
    Kuehn, Michael S.
    Taylor, Christopher M.
    Neph, Shane
    Koch, Christoph M.
    Asthana, Saurabh
    Malhotra, Ankit
    Adzhubei, Ivan
    Greenbaum, Jason A.
    Andrews, Robert M.
    Flicek, Paul
    Boyle, Patrick J.
    Cao, Hua
    Carter, Nigel P.
    Clelland, Gayle K.
    Davis, Sean
    Day, Nathan
    Dhami, Pawandeep
    Dillon, Shane C.
    Dorschner, Michael O.
    Fiegler, Heike
    Giresi, Paul G.
    Goldy, Jeff
    Hawrylycz, Michael
    Haydock, Andrew
    Humbert, Richard
    James, Keith D.
    Johnson, Brett E.
    Johnson, Ericka M.
    Frum, Tristan T.
    Rosenzweig, Elizabeth R.
    Karnani, Neerja
    Lee, Kirsten
    Lefebvre, Gregory C.
    Navas, Patrick A.
    Neri, Fidencio
    Parker, Stephen C. J.
    Sabo, Peter J.
    Sandstrom, Richard
    Shafer, Anthony
    [J]. NATURE, 2007, 447 (7146) : 799 - 816
  • [8] Exploratory analysis of genomic segmentations with Segtools
    Buske, Orion J.
    Hoffman, Michael M.
    Ponts, Nadia
    Le Roch, Karine G.
    Noble, William Stafford
    [J]. BMC BIOINFORMATICS, 2011, 12
  • [9] Distribution and intensity of constraint in mammalian genomic sequence
    Cooper, GM
    Stone, EA
    Asimenos, G
    Green, ED
    Batzoglou, S
    Sidow, A
    [J]. GENOME RESEARCH, 2005, 15 (07) : 901 - 913
  • [10] Landscape of transcription in human cells
    Djebali, Sarah
    Davis, Carrie A.
    Merkel, Angelika
    Dobin, Alex
    Lassmann, Timo
    Mortazavi, Ali
    Tanzer, Andrea
    Lagarde, Julien
    Lin, Wei
    Schlesinger, Felix
    Xue, Chenghai
    Marinov, Georgi K.
    Khatun, Jainab
    Williams, Brian A.
    Zaleski, Chris
    Rozowsky, Joel
    Roeder, Maik
    Kokocinski, Felix
    Abdelhamid, Rehab F.
    Alioto, Tyler
    Antoshechkin, Igor
    Baer, Michael T.
    Bar, Nadav S.
    Batut, Philippe
    Bell, Kimberly
    Bell, Ian
    Chakrabortty, Sudipto
    Chen, Xian
    Chrast, Jacqueline
    Curado, Joao
    Derrien, Thomas
    Drenkow, Jorg
    Dumais, Erica
    Dumais, Jacqueline
    Duttagupta, Radha
    Falconnet, Emilie
    Fastuca, Meagan
    Fejes-Toth, Kata
    Ferreira, Pedro
    Foissac, Sylvain
    Fullwood, Melissa J.
    Gao, Hui
    Gonzalez, David
    Gordon, Assaf
    Gunawardena, Harsha
    Howald, Cedric
    Jha, Sonali
    Johnson, Rory
    Kapranov, Philipp
    King, Brandon
    [J]. NATURE, 2012, 489 (7414) : 101 - 108