Fast and accurate genomic analyses using genome graphs

Cited by: 133
Authors
Rakocevic, Goran [1, 2]
Semenyuk, Vladimir [1, 2]
Lee, Wan-Ping [1]
Spencer, James [1, 2]
Browning, John [1, 2]
Johnson, Ivan J. [1, 2]
Arsenijevic, Vladan [1, 2]
Nadj, Jelena [1, 2]
Ghose, Kaushik [1, 2]
Suciu, Maria C. [1, 2]
Ji, Sun-Gou [1, 2]
Demir, Gulfem [1, 2]
Li, Lizao [1, 2]
Toptas, Berke C. [1, 2]
Dolgoborodov, Alexey [1]
Pollex, Bjorn [1, 2]
Spulber, Iosif [1]
Glotova, Irina [1, 2]
Komar, Peter [1, 2]
Stachyra, Andrew L. [1, 2]
Li, Yilong [1, 2]
Popovic, Milos [1, 2]
Kallberg, Morten [1]
Jain, Amit [1, 2]
Kural, Deniz [1, 2]
Affiliations
[1] Seven Bridges Genomics Inc, Cambridge, MA 02129 USA
[2] Totient Inc, Cambridge, MA 02140 USA
Keywords
SHORT READ ALIGNMENT; DISCOVERY; SEQUENCE; PROVIDES; QUALITY; LOCI; MAP
DOI
10.1038/s41588-018-0316-4
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
The human reference genome serves as the foundation for genomics by providing a scaffold for alignment of sequencing reads, but currently only reflects a single consensus haplotype, thus impairing analysis accuracy. Here we present a graph reference genome implementation that enables read alignment across 2,800 diploid genomes encompassing 12.6 million SNPs and 4.0 million insertions and deletions (indels). The pipeline processes one whole-genome sequencing sample in 6.5 h using a system with 36 CPU cores. We show that using a graph genome reference improves read mapping sensitivity and produces a 0.5% increase in variant calling recall, with unaffected specificity. Structural variations incorporated into a graph genome can be genotyped accurately under a unified framework. Finally, we show that iterative augmentation of graph genomes yields incremental gains in variant calling accuracy. Our implementation is an important advance toward fulfilling the promise of graph genomes to radically enhance the scalability and accuracy of genomic analyses.
Pages: 354-+
Number of pages: 11