Characterizing the Major Structural Variant Alleles of the Human Genome

Cited by: 324
Authors
Audano, Peter A. [1]
Sulovari, Arvis [1]
Graves-Lindsay, Tina A. [2]
Cantsilieris, Stuart [1]
Sorensen, Melanie [1]
Welch, AnneMarie E. [1]
Dougherty, Max L. [1]
Nelson, Bradley J. [1]
Shah, Ankeeta [3]
Dutcher, Susan K. [2]
Warren, Wesley C. [2]
Magrini, Vincent [4,5]
McGrath, Sean D. [4]
Li, Yang I. [6,7]
Wilson, Richard K. [4,5]
Eichler, Evan E. [1,8]
Affiliations
[1] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA
[2] Washington Univ, Sch Med, McDonnell Genome Inst, Dept Genet, St Louis, MO 63108 USA
[3] Univ Chicago, Comm Genet Genom & Syst Biol, Chicago, IL 60637 USA
[4] Nationwide Childrens Hosp, Inst Genom Med, Columbus, OH 43205 USA
[5] Ohio State Univ, Coll Med, Columbus, OH 43210 USA
[6] Univ Chicago, Sect Genet Med, Chicago, IL 60637 USA
[7] Univ Chicago, Dept Human Genet, Chicago, IL 60637 USA
[8] Univ Washington, Howard Hughes Med Inst, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA); Medical Research Council (UK);
Keywords
LARGE-SCALE VARIATION; GENETIC-VARIATION; SEQUENCE; DIVERSITY; RETROTRANSPOSITION; RECOMBINATION; ASSEMBLIES; PATTERNS;
DOI
10.1016/j.cell.2018.12.019
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and non-coding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity.
Pages: 663 / +
Page count: 32