Analysis of sequence conservation at nucleotide resolution

Cited by: 54
Authors
Asthana, Saurabh [2]
Roytberg, Mikhail [3]
Stamatoyannopoulos, John [1]
Sunyaev, Shamil [2]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Harvard Univ, Brigham & Womens Hosp, Sch Med, Div Genet, Boston, MA 02115 USA
[3] Russian Acad Sci, Inst Math Problems Biol, Computat Biol Grp, Pushchino 142292, Russia
DOI
10.1371/journal.pcbi.0030254
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
One of the major goals of comparative genomics is to understand the evolutionary history of each nucleotide in the human genome sequence, and the degree to which it is under selective pressure. Ascertainment of selective constraint at nucleotide resolution is particularly important for predicting the functional significance of human genetic variation and for analyzing the sequence substructure of cis-regulatory sequences and other functional elements. Current methods for analysis of sequence conservation are focused on delineation of conserved regions comprising tens or even hundreds of consecutive nucleotides. We therefore developed a novel computational approach designed specifically for scoring evolutionary conservation at individual base-pair resolution. Our approach estimates the rate at which each nucleotide position is evolving, computes the probability of neutrality given this rate estimate, and summarizes the result in a Sequence CONservation Evaluation (SCONE) score. We computed SCONE scores in a continuous fashion across 1% of the human genome for which high-quality sequence information from up to 23 genomes is available. We show that SCONE scores are clearly correlated with the allele frequency of human polymorphisms in both coding and noncoding regions. We find that the majority of noncoding conserved nucleotides lie outside of longer conserved elements predicted by other conservation analyses, and are experiencing ongoing selection in modern humans, as is evident from the allele frequency spectrum of human polymorphism. We also applied SCONE to analyze the distribution of conserved nucleotides within functional regions. These regions are markedly enriched in individually conserved positions and short (< 15 bp) conserved "chunks." Our results collectively suggest that the majority of functionally important noncoding conserved positions are highly fragmented and reside outside of canonically defined long conserved noncoding sequences. A small subset of these fragmented positions may be identified with high confidence.
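The notion of short (< 15 bp) conserved "chunks" in the abstract can be illustrated with a minimal sketch. This is not the SCONE implementation and makes several assumptions that the abstract does not state: that each position carries a single per-base score, that a lower score means stronger evidence against neutrality, and that a fixed cutoff (here 0.05) marks a position as conserved. The function name `conserved_chunks` and its parameters are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def conserved_chunks(
    scores: List[float],
    threshold: float = 0.05,  # assumed cutoff, not taken from the paper
    max_len: int = 15,        # "short" chunks are runs shorter than 15 bp
) -> List[Tuple[int, int]]:
    """Return (start, length) for each run of consecutive conserved
    positions whose length is below max_len.

    A position counts as conserved when its score is below threshold
    (an illustrative convention only).
    """
    chunks = []
    pos = 0
    # groupby splits the score list into maximal runs that share the
    # same conserved / not-conserved label.
    for is_conserved, run in groupby(scores, key=lambda s: s < threshold):
        length = sum(1 for _ in run)
        if is_conserved and length < max_len:
            chunks.append((pos, length))
        pos += length
    return chunks


if __name__ == "__main__":
    toy_scores = [0.5, 0.01, 0.02, 0.9, 0.001, 0.6]
    print(conserved_chunks(toy_scores))
```

Positions are 0-based offsets into the score list. Runs of 15 or more conserved positions are deliberately skipped, since those would correspond to the longer conserved elements that the abstract contrasts with fragmented ones.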
Pages: 2559-2568
Page count: 10