Characterization of missing human genome sequences and copy-number polymorphic insertions

Cited by: 100
Authors
Kidd, Jeffrey M. [1]
Sampas, Nick [2]
Antonacci, Francesca [1]
Graves, Tina [3]
Fulton, Robert [3]
Hayden, Hillary S. [1]
Alkan, Can [1]
Malig, Maika [1]
Ventura, Mario [4]
Giannuzzi, Giuliana [4]
Kallicki, Joelle [3]
Anderson, Paige [2]
Tsalenko, Anya [2]
Yamada, N. Alice [2]
Tsang, Peter [2]
Kaul, Rajinder [1]
Wilson, Richard K. [3]
Bruhn, Laurakay [2]
Eichler, Evan E. [1, 5]
Affiliations
[1] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA
[2] Agilent Labs, Santa Clara, CA USA
[3] Washington Univ, Sch Med, Genome Sequencing Ctr, St Louis, MO USA
[4] Univ Bari, Dept Genet & Microbiol, Bari, Italy
[5] Howard Hughes Med Inst, Seattle, WA USA
Funding
US National Institutes of Health; US National Science Foundation
Keywords
STRUCTURAL VARIATION; FINE-SCALE; MAP;
DOI
10.1038/NMETH.1451
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
The extent of human genomic structural variation suggests that there must be portions of the genome yet to be discovered, annotated and characterized at the sequence level. We present a resource and analysis of 2,363 new insertion sequences corresponding to 720 genomic loci. We found that a substantial fraction of these sequences are either missing, fragmented or misassigned when compared to recent de novo sequence assemblies from short-read next-generation sequence data. We determined that 18-37% of these new insertions are copy-number polymorphic, including loci that show extensive population stratification among Europeans, Asians and Africans. Complete sequencing of 156 of these insertions identified new exons and conserved noncoding sequences not yet represented in the reference genome. We developed a method to accurately genotype these new insertions by mapping next-generation sequencing datasets to the breakpoint, thereby providing a means to characterize copy-number status for regions previously inaccessible to single-nucleotide polymorphism microarrays.
Pages: 365 / U47
Page count: 8
Source: Nature Methods, 2010, 7 (5): 365-371