Graphtyper enables population-scale genotyping using pangenome graphs

Cited by: 163
Authors
Eggertsson, Hannes P. [1, 2]
Jonsson, Hakon [1]
Kristmundsdottir, Snaedis [1, 3]
Hjartarson, Eirikur [1]
Kehr, Birte [1, 4]
Masson, Gisli [1]
Zink, Florian [1]
Hjorleifsson, Kristjan E. [1]
Jonasdottir, Aslaug [1]
Jonasdottir, Adalbjorg [1]
Jonsdottir, Ingileif [1, 5]
Gudbjartsson, Daniel F. [1, 2]
Melsted, Pall [1, 2]
Stefansson, Kari [1, 5]
Halldorsson, Bjarni V. [1, 3]
Affiliations
[1] deCODE Genet Amgen Inc, Reykjavik, Iceland
[2] Univ Iceland, Sch Engn & Nat Sci, Reykjavik, Iceland
[3] Reykjavik Univ, Sch Sci & Engn, Reykjavik, Iceland
[4] BIH, Berlin, Germany
[5] Univ Iceland, Sch Hlth Sci, Fac Med, Reykjavik, Iceland
Keywords
GENOME SEQUENCE; VARIANTS; FRAMEWORK; MAP
DOI
10.1038/ng.3964
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
A fundamental requirement for genetic studies is an accurate determination of sequence variation. While human genome sequence diversity is increasingly well characterized, there is a need for efficient ways to use this knowledge in sequence analysis. Here we present Graphtyper, a publicly available novel algorithm and software for discovering and genotyping sequence variants. Graphtyper realigns short-read sequence data to a pangenome, a variation-aware graph structure that encodes sequence variation within a population by representing possible haplotypes as graph paths. Our results show that Graphtyper is fast, highly scalable, and provides sensitive and accurate genotype calls. Graphtyper genotyped 89.4 million sequence variants in the whole genomes of 28,075 Icelanders using less than 100,000 CPU days, including detailed genotyping of six human leukocyte antigen (HLA) genes. We show that Graphtyper is a valuable tool in characterizing sequence variation in both small and population-scale sequencing studies.
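The idea of representing possible haplotypes as graph paths can be illustrated with a toy sketch. This is not Graphtyper's implementation: the graph contents, the names `GRAPH`, `haplotypes`, and `supporting_paths`, and the use of exact substring matching in place of real read alignment are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy graph: each site lists its alternative sequences.
# A site with a single entry is an invariant reference segment.
GRAPH = [["ACGT"], ["A", "G"], ["TTC"], ["C", "CAG"], ["GA"]]


def haplotypes(graph):
    """Yield every path through the graph as (allele_indices, sequence)."""
    for choice in product(*(range(len(site)) for site in graph)):
        seq = "".join(site[i] for site, i in zip(graph, choice))
        yield choice, seq


def supporting_paths(read, graph):
    """Return the allele-index tuples of paths whose sequence contains the read.

    Exact substring matching is a simplification; a real genotyper
    aligns reads with mismatches and weighs base and mapping quality.
    """
    return [choice for choice, seq in haplotypes(graph) if read in seq]


if __name__ == "__main__":
    for choice, seq in haplotypes(GRAPH):
        print(choice, seq)
    # A read spanning the SNP site is compatible only with paths carrying "G".
    print(supporting_paths("GTGTTC", GRAPH))
```

Enumerating all paths is feasible only for a handful of sites, since the number of paths grows multiplicatively with each variant site; the sketch is meant to show the data model, not a scalable method.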
Pages: 1654-+
Page count: 9