Resolving the complexity of the human genome using single-molecule sequencing

Cited by: 526
Authors
Chaisson, Mark J. P. [1]
Huddleston, John [1,2]
Dennis, Megan Y. [1]
Sudmant, Peter H. [1]
Malig, Maika [1]
Hormozdiari, Fereydoun [1]
Antonacci, Francesca [3]
Surti, Urvashi [4]
Sandstrom, Richard [1]
Boitano, Matthew [5]
Landolin, Jane M. [5]
Stamatoyannopoulos, John A. [1]
Hunkapiller, Michael W. [5]
Korlach, Jonas [5]
Eichler, Evan E. [1,2]
Affiliations
[1] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA
[2] Univ Washington, Howard Hughes Med Inst, Seattle, WA 98195 USA
[3] Univ Bari Aldo Moro, Dipartimento Biol, I-70125 Bari, Italy
[4] Univ Pittsburgh, Dept Pathol, Pittsburgh, PA 15261 USA
[5] Pacific Biosci Calif Inc, Menlo Pk, CA 94025 USA
Funding
US National Institutes of Health (NIH)
Keywords
ALIGNMENT; PROGRAM; GAPS
DOI
10.1038/nature13907
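Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract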
The human genome is arguably the most complete mammalian reference assembly(1-3), yet more than 160 euchromatic gaps remain(4-6) and aspects of its structural variation remain poorly understood ten years after its completion(7-9). To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing(10). We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome, 78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions. We resolve the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kilobases in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long short tandem repeats. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.
Pages: 608 / U163
Number of pages: 11
Related papers
27 in total
[11]   Using population admixture to help complete maps of the human genome [J].
Genovese, Giulio ;
Handsaker, Robert E. ;
Li, Heng ;
Altemose, Nicolas ;
Lindgren, Amelia M. ;
Chambert, Kimberly ;
Pasaniuc, Bogdan ;
Price, Alkes L. ;
Reich, David ;
Morton, Cynthia C. ;
Pollak, Martin R. ;
Wilson, James G. ;
McCarroll, Steven A. .
NATURE GENETICS, 2013, 45 (04) :406-414
[12]   Reconstructing complex regions of genomes using long-read sequencing technology [J].
Huddleston, John ;
Ranade, Swati ;
Malig, Maika ;
Antonacci, Francesca ;
Chaisson, Mark ;
Hon, Lawrence ;
Sudmant, Peter H. ;
Graves, Tina A. ;
Alkan, Can ;
Dennis, Megan Y. ;
Wilson, Richard K. ;
Turner, Stephen W. ;
Korlach, Jonas ;
Eichler, Evan E. .
GENOME RESEARCH, 2014, 24 (04) :688-696
[13]   Censor - A program for identification and elimination of repetitive elements from DNA sequences [J].
Jurka, J ;
Klonowski, P ;
Dagman, V ;
Pelton, P .
COMPUTERS & CHEMISTRY, 1996, 20 (01) :119-121
[14]   A Human Genome Structural Variation Sequencing Resource Reveals Insights into Mutational Mechanisms [J].
Kidd, Jeffrey M. ;
Graves, Tina ;
Newman, Tera L. ;
Fulton, Robert ;
Hayden, Hillary S. ;
Malig, Maika ;
Kallicki, Joelle ;
Kaul, Rajinder ;
Wilson, Richard K. ;
Eichler, Evan E. .
CELL, 2010, 143 (05) :837-847
[15]   A vast collection of microbial genes that are toxic to bacteria [J].
Kimelman, Aya ;
Levy, Asaf ;
Sberro, Hila ;
Kidron, Shahar ;
Leavitt, Azita ;
Amitai, Gil ;
Yoder-Himes, Deborah R. ;
Wurtzel, Omri ;
Zhu, Yiwen ;
Rubin, Edward M. ;
Sorek, Rotem .
GENOME RESEARCH, 2012, 22 (04) :802-809
[16]   A high-resolution recombination map of the human genome [J].
Kong, A ;
Gudbjartsson, DF ;
Sainz, J ;
Jonsdottir, GM ;
Gudjonsson, SA ;
Richardsson, B ;
Sigurdardottir, S ;
Barnard, J ;
Hallbeck, B ;
Masson, G ;
Shlien, A ;
Palsson, ST ;
Frigge, ML ;
Thorgeirsson, TE ;
Gulcher, JR ;
Stefansson, K .
NATURE GENETICS, 2002, 31 (03) :241-247
[17]   Molecular cloning of a translocation breakpoint hotspot in 22q11 [J].
Kurahashi, Hiroki ;
Inagaki, Hidehito ;
Hosoba, Eriko ;
Kato, Takema ;
Ohye, Tamae ;
Kogo, Hiroshi ;
Emanuel, Beverly S. .
GENOME RESEARCH, 2007, 17 (04) :461-469
[18]   Initial sequencing and analysis of the human genome [J].
Lander, ES ;
Int Human Genome Sequencing Consortium ;
Linton, LM ;
Birren, B ;
Nusbaum, C ;
Zody, MC ;
Baldwin, J ;
Devon, K ;
Dewar, K ;
Doyle, M ;
FitzHugh, W ;
Funke, R ;
Gage, D ;
Harris, K ;
Heaford, A ;
Howland, J ;
Kann, L ;
Lehoczky, J ;
LeVine, R ;
McEwan, P ;
McKernan, K ;
Meldrim, J ;
Mesirov, JP ;
Miranda, C ;
Morris, W ;
Naylor, J ;
Raymond, C ;
Rosetti, M ;
Santos, R ;
Sheridan, A ;
Sougnez, C ;
Stange-Thomann, N ;
Stojanovic, N ;
Subramanian, A ;
Wyman, D ;
Rogers, J ;
Sulston, J ;
Ainscough, R ;
Beck, S ;
Bentley, D ;
Burton, J ;
Clee, C ;
Carter, N ;
Coulson, A ;
Deadman, R ;
Deloukas, P ;
Dunham, A ;
Dunham, I ;
Durbin, R ;
French, L .
NATURE, 2001, 409 (6822) :860-921
[19]   Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score [J].
Lee, Hayan ;
Schatz, Michael C. .
BIOINFORMATICS, 2012, 28 (16) :2097-2105
[20]   Mapping copy number variation by population-scale genome sequencing [J].
Mills, Ryan E. ;
Walter, Klaudia ;
Stewart, Chip ;
Handsaker, Robert E. ;
Chen, Ken ;
Alkan, Can ;
Abyzov, Alexej ;
Yoon, Seungtai Chris ;
Ye, Kai ;
Cheetham, R. Keira ;
Chinwalla, Asif ;
Conrad, Donald F. ;
Fu, Yutao ;
Grubert, Fabian ;
Hajirasouliha, Iman ;
Hormozdiari, Fereydoun ;
Iakoucheva, Lilia M. ;
Iqbal, Zamin ;
Kang, Shuli ;
Kidd, Jeffrey M. ;
Konkel, Miriam K. ;
Korn, Joshua ;
Khurana, Ekta ;
Kural, Deniz ;
Lam, Hugo Y. K. ;
Leng, Jing ;
Li, Ruiqiang ;
Li, Yingrui ;
Lin, Chang-Yun ;
Luo, Ruibang ;
Mu, Xinmeng Jasmine ;
Nemesh, James ;
Peckham, Heather E. ;
Rausch, Tobias ;
Scally, Aylwyn ;
Shi, Xinghua ;
Stromberg, Michael P. ;
Stuetz, Adrian M. ;
Urban, Alexander Eckehart ;
Walker, Jerilyn A. ;
Wu, Jiantao ;
Zhang, Yujun ;
Zhang, Zhengdong D. ;
Batzer, Mark A. ;
Ding, Li ;
Marth, Gabor T. ;
McVean, Gil ;
Sebat, Jonathan ;
Snyder, Michael ;
Wang, Jun .
NATURE, 2011, 470 (7332) :59-65