Systematic discovery of conservation states for single-nucleotide annotation of the human genome

Cited by: 15
Authors
Arneson, Adriana [1, 2]
Ernst, Jason [1, 2, 3, 4, 5, 6]
Affiliations
[1] Univ Calif Los Angeles, Bioinformat Interdept Program, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Biol Chem, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, Eli & Edythe Broad Ctr Regenerat Med & Stem Cell, Los Angeles, CA 90095 USA
[4] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
[5] Univ Calif Los Angeles, Jonsson Comprehens Canc Ctr, Los Angeles, CA 90095 USA
[6] Univ Calif Los Angeles, Inst Mol Biol, Los Angeles, CA 90095 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
PARTITIONING HERITABILITY; FUNCTIONAL ANNOTATION; POINT MUTATIONS; DNA ELEMENTS; VARIANTS; EVOLUTION; PATHOGENICITY; PREDICTION; VERTEBRATE; CONSTRAINT;
DOI
10.1038/s42003-019-0488-1
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Comparative genomics sequence data is an important source of information for interpreting genomes. Genome-wide annotations based on this data have largely focused on univariate scores or binary elements of evolutionary constraint. Here we present a complementary whole genome annotation approach, ConsHMM, which applies a multivariate hidden Markov model to learn de novo 'conservation states' based on the combinatorial and spatial patterns of which species align to and match a reference genome in a multiple species DNA sequence alignment. We applied ConsHMM to a 100-way vertebrate sequence alignment to annotate the human genome at single nucleotide resolution into 100 conservation states. These states have distinct enrichments for other genomic information including gene annotations, chromatin states, repeat families, and bases prioritized by various variant prioritization scores. Constrained elements have distinct heritability partitioning enrichments depending on their conservation state assignment. ConsHMM conservation states are a resource for analyzing genomes and genetic variants.
Pages: 14