Fast and accurate HLA typing from short-read next-generation sequence data with xHLA

Cited by: 103
Authors
Xie, Chao [1]
Yeo, Zhen Xuan [1]
Wong, Marie [1]
Piper, Jason [1]
Long, Tao [2]
Kirkness, Ewen F. [2]
Biggs, William H. [2]
Bloom, Ken [2]
Spellman, Stephen [3]
Vierra-Green, Cynthia [3]
Brady, Colleen [3]
Scheuermann, Richard H. [4,5]
Telenti, Amalio [2]
Howard, Sally [2]
Brewerton, Suzanne [1]
Turpaz, Yaron [1,2]
Venter, J. Craig [2,4]
Affiliations
[1] Human Longev Singapore Pte Ltd, Singapore 138543, Singapore
[2] Human Longev Inc, San Diego, CA 92121 USA
[3] Ctr Int Blood & Marrow Transplant Res, Minneapolis, MN 55401 USA
[4] J Craig Venter Inst, La Jolla, CA 92037 USA
[5] Univ Calif San Diego, Dept Pathol, La Jolla, CA 92093 USA
Keywords
MHC; autoimmune diseases; transplantation; HUMAN-LEUKOCYTE ANTIGEN; IMGT/HLA DATABASE; ALIGNMENT; DISEASE; ASSOCIATIONS; DIVERSITY; SELECTION; DONORS
DOI
10.1073/pnas.1707945114
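Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09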
Abstract
The HLA gene complex on human chromosome 6 is one of the most polymorphic regions in the human genome and contributes in large part to the diversity of the immune system. Accurate typing of HLA genes with short-read sequencing data has historically been difficult due to the sequence similarity between the polymorphic alleles. Here, we introduce an algorithm, xHLA, that iteratively refines the mapping results at the amino acid level to achieve 99-100% four-digit typing accuracy for both class I and II HLA genes, taking only ~3 min to process a 30× whole-genome BAM file on a desktop computer.
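As an illustration of what "four-digit typing accuracy" measures, the sketch below truncates HLA allele names to their first two fields (for example, `A*02:01:01:01` becomes `A*02:01`) and scores predicted calls against reference calls. It is not part of xHLA: the function names `to_four_digit` and `four_digit_accuracy` are hypothetical, and the scoring rule (fraction of reference alleles matched, ignoring allele order) is an assumption about how such an accuracy could be computed, not the evaluation protocol of the paper.

```python
from collections import Counter


def to_four_digit(allele: str) -> str:
    """Keep the gene name and the first two colon-separated fields.

    Hypothetical helper: 'A*02:01:01:01' -> 'A*02:01'.
    """
    gene, _, fields = allele.partition("*")
    return gene + "*" + ":".join(fields.split(":")[:2])


def four_digit_accuracy(predicted, truth):
    """Fraction of reference alleles matched at two-field resolution.

    Assumed scoring rule: alleles are compared as multisets, so the
    order of the two alleles at a locus does not matter.
    """
    if not truth:
        raise ValueError("truth must contain at least one allele")
    pred = Counter(to_four_digit(a) for a in predicted)
    true = Counter(to_four_digit(a) for a in truth)
    matched = sum((pred & true).values())  # multiset intersection
    return matched / sum(true.values())


if __name__ == "__main__":
    predicted = ["A*02:01:01", "A*24:02", "DRB1*15:01:01:01", "DRB1*04:01"]
    truth = ["A*02:01", "A*03:01", "DRB1*04:01:01", "DRB1*15:01"]
    print(four_digit_accuracy(predicted, truth))
```

In this made-up example three of the four reference alleles are matched at two-field resolution. The truncation drops any expression suffix carried on a later field, so a production comparison would need to handle such suffixes explicitly.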
Pages: 8059-8064
Page count: 6