Genetic Variation in an Individual Human Exome

Cited by: 191
Authors
Ng, Pauline C. [1 ]
Levy, Samuel [1 ]
Huang, Jiaqi [1 ]
Stockwell, Timothy B. [1 ]
Walenz, Brian P. [1 ]
Li, Kelvin [1 ]
Axelrod, Nelson [1 ]
Busam, Dana A. [1 ]
Strausberg, Robert L. [1 ]
Venter, J. Craig [1 ]
Affiliation
[1] J Craig Venter Inst, Rockville, MD USA
Source
PLOS GENETICS | 2008 / Vol. 4 / Issue 08
DOI
10.1371/journal.pgen.1000160
Chinese Library Classification (CLC) number
Q3 [Genetics];
Subject classification code
071007 ; 090102 ;
Abstract
There is much interest in characterizing the variation in a human individual, because this may elucidate what contributes significantly to a person's phenotype, thereby enabling personalized genomics. We focus here on the variants in a person's 'exome,' which is the set of exons in a genome, because the exome is believed to harbor much of the functional variation. We provide an analysis of the ~12,500 variants that affect the protein-coding portion of an individual's genome. We identified ~10,400 nonsynonymous single nucleotide polymorphisms (nsSNPs) in this individual, of which ~15-20% are rare in the human population. We predict that ~1,500 nsSNPs affect protein function, and these tend to be heterozygous, rare, or novel. Of the ~700 coding indels, approximately half have lengths that are a multiple of three, which causes insertions/deletions of amino acids in the corresponding protein rather than introducing frameshifts. Coding indels also occur frequently at the termini of genes, so even if an indel causes a frameshift, an alternative start or stop site in the gene can still be used to make a functional protein. In summary, we reduced the set of ~12,500 nonsilent coding variants by ~8-fold to a set of variants that are most likely to have major effects on their proteins' functions. This is our first glimpse of an individual's exome and a snapshot of the current state of personalized genomics. The majority of coding variants in this individual are common and appear to be functionally neutral. Our results also indicate that some variants can be used to improve the current NCBI human reference genome. As more genomes are sequenced, many rare variants and non-SNP variants will be discovered. We present an approach to analyze the coding variation in humans by proposing multiple bioinformatic methods to home in on possible functional variation.
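The abstract's point about indel lengths can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `classify_indel` and the example lengths are hypothetical, and the check only looks at reading frame, ignoring the gene-terminus effect the abstract also describes.

```python
def classify_indel(length: int) -> str:
    """Label a coding indel by its effect on the reading frame.

    A length that is a multiple of three adds or removes whole codons
    (amino acids); any other length shifts the downstream reading frame.
    """
    if length <= 0:
        raise ValueError("indel length must be a positive number of bases")
    return "in-frame" if length % 3 == 0 else "frameshift"


# Hypothetical indel lengths in bases, for illustration only (not data from the paper).
indel_lengths = [1, 3, 4, 6, 9]
labels = [classify_indel(n) for n in indel_lengths]
in_frame_fraction = labels.count("in-frame") / len(labels)
```

With these made-up lengths, three of the five indels are in-frame; the abstract reports that roughly half of the ~700 coding indels in this individual fall into that class.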
Pages: 15