trioPhaser: using Mendelian inheritance logic to improve genomic phasing of trios

Cited by: 3
Authors
Miller, Dustin B. [1]
Piccolo, Stephen R. [1]
Affiliations
[1] Brigham Young Univ, Dept Biol, Provo, UT 84602 USA
Keywords
Haplotyping; Phasing; Trios; Genomics; Next-generation sequencing
DOI
10.1186/s12859-021-04470-4
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: When analyzing an individual's DNA sequence data, knowing which nucleotide was inherited from each parent can help identify certain types of DNA variants. When genotypes are available for both parents (a trio), Mendelian inheritance logic can accurately phase (haplotype) the majority (67-83%) of an individual's heterozygous nucleotide positions. However, when all three members of a trio are heterozygous at a position, Mendelian inheritance logic cannot phase it. For such positions, a computational phasing algorithm can be used. Existing phasing algorithms use a haplotype reference panel, sequencing reads, and/or parental genotypes to phase an individual; however, they are limited in that they can phase only certain types of variants, require a specific genotype build, require large amounts of storage capacity, and/or require long run times. We created trioPhaser to address these challenges.

Results: trioPhaser takes gVCF files from an individual and their parents as input and outputs a phased VCF file. The input trio data are first phased using Mendelian inheritance logic. The positions that cannot be phased using inheritance information alone are then phased by the SHAPEIT4 phasing algorithm. Using whole-genome sequencing data from 52 trios, we show that trioPhaser increases the total number of phased positions by an average of 21.0% compared with SHAPEIT4 alone and by an average of 10.5% compared with Mendelian inheritance logic alone. In addition, we show that the accuracy of the phased calls output by trioPhaser is similar to that of linked-read and read-backed phasing.

Conclusion: trioPhaser is a containerized software tool that uses both Mendelian inheritance logic and SHAPEIT4 to phase trios when gVCF files are available. By combining the two phasing methods, it phases more variant positions than either method can phase alone.
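The Mendelian inheritance logic described in the abstract can be illustrated with a minimal sketch. This is not trioPhaser's actual implementation: the function name `phase_child`, the tuple-based genotype representation, and the decision to return `None` for both ambiguous and Mendelian-inconsistent positions are assumptions made for illustration. The sketch tries both possible parent-of-origin assignments for the child's two alleles and accepts an assignment only if exactly one is consistent with the parental genotypes.

```python
from typing import Optional, Tuple

# Hypothetical representation: an unordered pair of alleles at one position.
Genotype = Tuple[str, str]


def phase_child(child: Genotype, father: Genotype, mother: Genotype) -> Optional[Tuple[str, str]]:
    """Return (paternal_allele, maternal_allele) for the child, or None.

    None means the position cannot be resolved from the trio genotypes alone:
    either both assignments are possible (for example, all three members are
    heterozygous for the same alleles) or none is (a Mendelian inconsistency).
    """
    a, b = child
    # The two candidate assignments; a set collapses them for a homozygous child.
    candidates = {(a, b), (b, a)}
    # Keep an assignment only if the father could have transmitted the first
    # allele and the mother could have transmitted the second.
    valid = [c for c in candidates if c[0] in father and c[1] in mother]
    return valid[0] if len(valid) == 1 else None


# Father is homozygous A/A, so the child's A must be paternal and G maternal.
print(phase_child(("A", "G"), ("A", "A"), ("A", "G")))
# All three members heterozygous A/G: inheritance logic alone cannot decide.
print(phase_child(("A", "G"), ("A", "G"), ("A", "G")))
```

Positions for which a function like this returns `None` because of ambiguity correspond to the case the abstract describes, where a statistical phasing algorithm such as SHAPEIT4 is used instead.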
Pages: 8