Improved genome inference in the MHC using a population reference graph

Cited by: 124
Authors
Dilthey, Alexander [1]
Cox, Charles [2]
Iqbal, Zamin [1]
Nelson, Matthew R. [3]
McVean, Gil [1]
Affiliations
[1] Univ Oxford, Wellcome Trust Ctr Human Genet, Oxford, England
[2] GlaxoSmithKline, Dept Quantitat Sci, Stevenage, Herts, England
[3] GlaxoSmithKline, Dept Quantitat Sci, Res Triangle Pk, NC USA
Funding
Wellcome Trust (UK);
Keywords
READ ALIGNMENT; HAPLOTYPES; INVERSION; SEQUENCES; DIVERSITY; SNP;
DOI
10.1038/ng.3257
Chinese Library Classification
Q3 [Genetics];
Discipline classification codes
071007 ; 090102 ;
Abstract
Although much is known about human genetic variation, such information is typically ignored in assembling new genomes. Instead, reads are mapped to a single reference, which can lead to poor characterization of regions of high sequence or structural diversity. We introduce a population reference graph, which combines multiple reference sequences and catalogs of variation. The genomes of new samples are reconstructed as paths through the graph using an efficient hidden Markov model, allowing for recombination between different haplotypes and additional variants. We apply the method to the 4.5-Mb extended MHC region on human chromosome 6, combining 8 assembled haplotypes, the sequences of known classical HLA alleles and 87,640 SNP variants from the 1000 Genomes Project. Using simulations, SNP genotyping, and short-read and long-read data, we demonstrate how the method improves the accuracy of genome inference and identifies regions where the current set of reference sequences is substantially incomplete.
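The idea of reconstructing a sample as a path that can recombine between reference haplotypes can be illustrated with a toy Viterbi decoder. This is only a minimal sketch under strong simplifying assumptions, not the method described in the paper: the references are equal-length, pre-aligned strings rather than a graph, the sample is a single observed allele string rather than sequencing reads, and the function name `viterbi_mosaic` and the parameters `switch_prob` and `error_prob` are hypothetical names introduced here for illustration.

```python
import math


def viterbi_mosaic(haplotypes, observed, switch_prob=0.01, error_prob=0.01):
    """Return, per site, the index of the reference haplotype on the most
    likely copying path (toy model; all parameters are illustrative)."""
    if not observed:
        return []
    n_haps = len(haplotypes)
    log_stay = math.log(1.0 - switch_prob)
    # Probability of recombining onto one specific other haplotype.
    log_switch = (
        math.log(switch_prob / (n_haps - 1)) if n_haps > 1 else float("-inf")
    )

    def log_emit(hap, site):
        # A match is likely; a mismatch is explained by error or a new variant.
        match = haplotypes[hap][site] == observed[site]
        return math.log(1.0 - error_prob if match else error_prob)

    def log_trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Uniform prior over the starting haplotype.
    score = [math.log(1.0 / n_haps) + log_emit(h, 0) for h in range(n_haps)]
    backpointers = []
    for site in range(1, len(observed)):
        new_score, pointers = [], []
        for cur in range(n_haps):
            best_prev = max(
                range(n_haps), key=lambda p: score[p] + log_trans(p, cur)
            )
            new_score.append(
                score[best_prev] + log_trans(best_prev, cur) + log_emit(cur, site)
            )
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_haps), key=lambda h: score[h])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    refs = ["AAAAAAAA", "CCCCCCCC"]
    print(viterbi_mosaic(refs, "AAAACCCC"))
```

With the two toy references above and the observed string `AAAACCCC`, the decoded path copies from the first reference for four sites and then switches to the second, because a single recombination is cheaper under these parameters than four mismatches.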
Pages: 682-688
Page count: 7