Characterizing the Major Structural Variant Alleles of the Human Genome

Cited by: 324
Authors
Audano, Peter A. [1]
Sulovari, Arvis [1]
Graves-Lindsay, Tina A. [2]
Cantsilieris, Stuart [1]
Sorensen, Melanie [1]
Welch, AnneMarie E. [1]
Dougherty, Max L. [1]
Nelson, Bradley J. [1]
Shah, Ankeeta [3]
Dutcher, Susan K. [2]
Warren, Wesley C. [2]
Magrini, Vincent [4, 5]
McGrath, Sean D. [4]
Li, Yang I. [6, 7]
Wilson, Richard K. [4, 5]
Eichler, Evan E. [1, 8]
Affiliations
[1] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA
[2] Washington Univ, Sch Med, McDonnell Genome Inst, Dept Genet, St Louis, MO 63108 USA
[3] Univ Chicago, Comm Genet Genom & Syst Biol, Chicago, IL 60637 USA
[4] Nationwide Childrens Hosp, Inst Genom Med, Columbus, OH 43205 USA
[5] Ohio State Univ, Coll Med, Columbus, OH 43210 USA
[6] Univ Chicago, Sect Genet Med, Chicago, IL 60637 USA
[7] Univ Chicago, Dept Human Genet, Chicago, IL 60637 USA
[8] Univ Washington, Howard Hughes Med Inst, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA); Medical Research Council (UK);
Keywords
LARGE-SCALE VARIATION; GENETIC-VARIATION; SEQUENCE; DIVERSITY; RETROTRANSPOSITION; RECOMBINATION; ASSEMBLIES; PATTERNS;
DOI
10.1016/j.cell.2018.12.019
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and non-coding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity.
Pages: 663 / +
Page count: 32
Related papers
62 in total
[11]   Finishing the euchromatic sequence of the human genome [J].
Collins, FS ;
Lander, ES ;
Rogers, J ;
Waterston, RH .
NATURE, 2004, 431 (7011) :931-945
[12]   L1 retrotransposition in human neural progenitor cells [J].
Coufal, Nicole G. ;
Garcia-Perez, Jose L. ;
Peng, Grace E. ;
Yeo, Gene W. ;
Mu, Yangling ;
Lovci, Michael T. ;
Morell, Maria ;
O'Shea, K. Sue ;
Moran, John V. ;
Gage, Fred H. .
NATURE, 2009, 460 (7259) :1127-1131
[13]   Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data [J].
Degner, Jacob F. ;
Marioni, John C. ;
Pai, Athma A. ;
Pickrell, Joseph K. ;
Nkadori, Everlyne ;
Gilad, Yoav ;
Pritchard, Jonathan K. .
BIOINFORMATICS, 2009, 25 (24) :3207-3212
[14]   An assessment of the sequence gaps: Unfinished business in a finished human genome [J].
Eichler, EE ;
Clark, RA ;
She, XW .
NATURE REVIEWS GENETICS, 2004, 5 (05) :345-354
[15]   Genome-wide patterns and properties of de novo mutations in humans [J].
Francioli, Laurent C. ;
Polak, Paz P. ;
Koren, Amnon ;
Menelaou, Androniki ;
Chun, Sung ;
Renkens, Ivo ;
van Duijn, Cornelia M. ;
Swertz, Morris ;
Wijmenga, Cisca ;
van Ommen, Gertjan ;
Slagboom, P. Eline ;
Boomsma, Dorret I. ;
Ye, Kai ;
Guryev, Victor ;
Arndt, Peter F. ;
Kloosterman, Wigard P. ;
de Bakker, Paul I. W. ;
Sunyaev, Shamil R. .
NATURE GENETICS, 2015, 47 (07) :822-+
[16]   Whole-genome sequence variation, population structure and demographic history of the Dutch population [J].
Francioli, Laurent C. ;
Menelaou, Androniki ;
Pulit, Sara L. ;
Van Dijk, Freerk ;
Palamara, Pier Francesco ;
Elbers, Clara C. ;
Neerincx, Pieter B. T. ;
Ye, Kai ;
Guryev, Victor ;
Kloosterman, Wigard P. ;
Deelen, Patrick ;
Abdellaoui, Abdel ;
Van Leeuwen, Elisabeth M. ;
Van Oven, Mannis ;
Vermaat, Martijn ;
Li, Mingkun ;
Laros, Jeroen F. J. ;
Karssen, Lennart C. ;
Kanterakis, Alexandros ;
Amin, Najaf ;
Hottenga, Jouke Jan ;
Lameijer, Eric-Wubbo ;
Kattenberg, Mathijs ;
Dijkstra, Martijn ;
Byelas, Heorhiy ;
Van Setten, Jessica ;
Van Schaik, Barbera D. C. ;
Bot, Jan ;
Nijman, Isaac J. ;
Renkens, Ivo ;
Marschall, Tobias ;
Schonhuth, Alexander ;
Hehir-Kwa, Jayne Y. ;
Handsaker, Robert E. ;
Polak, Paz ;
Sohail, Mashaal ;
Vuzman, Dana ;
Hormozdiari, Fereydoun ;
Van Enckevort, David ;
Mei, Hailiang ;
Koval, Vyacheslav ;
Moed, Matthijs H. ;
Van der Velde, K. Joeri ;
Rivadeneira, Fernando ;
Estrada, Karol ;
Medina-Gomez, Carolina ;
Isaacs, Aaron ;
McCarroll, Steven A. ;
Beekman, Marian ;
De Craen, Anton J. M. .
NATURE GENETICS, 2014, 46 (08) :818-825
[17]   Variation graph toolkit improves read mapping by representing genetic variation in the reference [J].
Garrison, Erik ;
Siren, Jouni ;
Novak, Adam M. ;
Hickey, Glenn ;
Eizenga, Jordan M. ;
Dawson, Eric T. ;
Jones, William ;
Garg, Shilpa ;
Markello, Charles ;
Lin, Michael F. ;
Paten, Benedict ;
Durbin, Richard .
NATURE BIOTECHNOLOGY, 2018, 36 (09) :875-+
[18]   Diseases of unstable repeat expansion: Mechanisms and common principles [J].
Gatchel, JR ;
Zoghbi, HY .
NATURE REVIEWS GENETICS, 2005, 6 (10) :743-755
[19]   karyoploteR: an R/Bioconductor package to plot customizable genomes displaying arbitrary data [J].
Gel, Bernat ;
Serra, Eduard .
BIOINFORMATICS, 2017, 33 (19) :3088-3090
[20]   Long-read sequence assembly of the gorilla genome [J].
Gordon, David ;
Huddleston, John ;
Chaisson, Mark J. P. ;
Hill, Christopher M. ;
Kronenberg, Zev N. ;
Munson, Katherine M. ;
Malig, Maika ;
Raja, Archana ;
Fiddes, Ian ;
Hillier, LaDeana W. ;
Dunn, Christopher ;
Baker, Carl ;
Armstrong, Joel ;
Diekhans, Mark ;
Paten, Benedict ;
Shendure, Jay ;
Wilson, Richard K. ;
Haussler, David ;
Chin, Chen-Shan ;
Eichler, Evan E. .
SCIENCE, 2016, 352 (6281)