Genotype phasing in pedigrees using whole-genome sequence data

Cited by: 3
Authors
Blackburn, August N. [1 ,2 ,3 ]
Blondell, Lucy [1 ,2 ]
Kos, Mark Z. [1 ,2 ]
Blackburn, Nicholas B. [1 ,2 ]
Peralta, Juan M. [1 ,2 ]
Stevens, Peter T. [1 ,2 ]
Lehman, Donna M. [4 ]
Blangero, John [1 ,2 ]
Goring, Harald H. H. [1 ,2 ]
Affiliations
[1] Univ Texas Rio Grande Valley, Sch Med, Dept Human Genet, Brownsville, TX 78520 USA
[2] Univ Texas Rio Grande Valley, Sch Med, South Texas Diabet & Obes Inst, Brownsville, TX 78520 USA
[3] St Marys Univ, Dept Biol Sci, San Antonio, TX USA
[4] Univ Texas Hlth Sci Ctr San Antonio, Dept Med, San Antonio, TX 78229 USA
Keywords
MEXICAN-AMERICANS; LINKAGE ANALYSIS; INFERENCE; DESCENT; MAPS;
DOI
10.1038/s41431-020-0574-3
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Phasing is the process of inferring haplotypes from genotype data. Efficient algorithms and associated software for accurate phasing in pedigrees are needed, especially for populations lacking reference panels of sequenced individuals. We present a novel method for phasing genotypes from whole-genome sequence data in pedigrees, called PULSAR (Phasing Using Lineage Specific Alleles/Rare variants). The method is based on the property that alleles specific to a single founding chromosome within a pedigree are highly informative for identifying haplotypes that are shared identical by descent. Simulation studies are used to assess the performance of PULSAR with various pedigree sizes and structures, and the effects of genotyping errors and of nonsequenced individuals are investigated. In pedigrees with complete sequencing and realistic genotyping error rates, PULSAR correctly phases >99.9% of heterozygous genotypes, excluding sites at which all individuals are heterozygous, and does so with a switch error rate frequently below 10^-4. PULSAR is highly accurate, capable of genotype error correction and imputation, and computationally competitive with alternative phasing software applicable to pedigrees. Our method has the significant advantage of not requiring the reference panels that are essential for other population-based phasing algorithms. A software implementation of PULSAR is freely available.
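The property the abstract relies on can be illustrated with a small sketch. This is not the PULSAR implementation; it is a toy example under stated assumptions: genotypes are coded as alternate-allele counts (0, 1, 2), there are no genotyping errors or de novo mutations, and all function and variable names (`lineage_specific_sites`, `descendants_sharing`, the sample IDs) are hypothetical. A site is treated as lineage specific when exactly one founder carries exactly one copy of the alternate allele, so the allele marks a single founding chromosome.

```python
from typing import Dict, List, Set

Genotypes = Dict[str, List[int]]  # sample ID -> alternate-allele count per site


def lineage_specific_sites(founders: Genotypes) -> Dict[int, str]:
    """Return {site index: founder ID} for sites where exactly one founder
    carries exactly one copy of the alternate allele (illustrative rule)."""
    n_sites = len(next(iter(founders.values())))
    tagged: Dict[int, str] = {}
    for site in range(n_sites):
        carriers = [(fid, g[site]) for fid, g in founders.items() if g[site] > 0]
        # One carrier, heterozygous: the allele sits on one founding chromosome.
        if len(carriers) == 1 and carriers[0][1] == 1:
            tagged[site] = carriers[0][0]
    return tagged


def descendants_sharing(site: int, nonfounders: Genotypes) -> Set[str]:
    """Non-founders carrying the alternate allele at a lineage-specific site.
    Under the no-error assumption they inherited that founding chromosome
    segment identical by descent."""
    return {pid for pid, g in nonfounders.items() if g[site] > 0}


# Toy pedigree: two founders and two children, four sites.
founders = {"F1": [1, 0, 1, 2], "F2": [0, 1, 1, 0]}
children = {"C1": [1, 0, 1, 1], "C2": [0, 1, 2, 1]}

tagged = lineage_specific_sites(founders)
sharing = {site: descendants_sharing(site, children) for site in tagged}
```

In this toy pedigree, sites 0 and 1 are lineage specific (to F1 and F2 respectively), while site 2 is carried by both founders and site 3 is homozygous in F1, so neither points to a single founding chromosome. A real method must additionally handle genotyping errors and nonsequenced individuals, which the abstract notes are evaluated by simulation.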
Pages: 790-803
Page count: 14