Fast and accurate haplotype frequency estimation for large haplotype vectors from pooled DNA data

Cited by: 8
Authors
Iliadis, Alexandros
Anastassiou, Dimitris
Wang, Xiaodong [1]
Affiliations
[1] Columbia Univ, Ctr Computat Biol & Bioinformat, New York, NY 10027 USA
Source
BMC GENETICS | 2012 / Vol. 13
Keywords
LARGE-SCALE ASSOCIATION; LINKAGE-DISEQUILIBRIUM; POPULATION; IDENTIFICATION; INFORMATION; EFFICIENCY; INFERENCE; SCREEN; TOOL;
DOI
10.1186/1471-2156-13-94
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Background: Typically, the first phase of a genome wide association study (GWAS) includes genotyping across hundreds of individuals and validation of the most significant SNPs. Allelotyping of pooled genomic DNA is a common approach to reduce the overall cost of the study. Knowledge of haplotype structure can provide additional information to single locus analyses. Several methods have been proposed for estimating haplotype frequencies in a population from pooled DNA data. Results: We introduce a technique for haplotype frequency estimation in a population from pooled DNA samples focusing on datasets containing a small number of individuals per pool (2 or 3 individuals) and a large number of markers. We compare our method with the publicly available state-of-the-art algorithms HIPPO and HAPLOPOOL on datasets of varying number of pools and marker sizes. We demonstrate that our algorithm provides improvements in terms of accuracy and computational time over competing methods for large number of markers while demonstrating comparable performance for smaller marker sizes. Our method is implemented in the "Tree-Based Deterministic Sampling Pool" (TDSPool) package which is available for download at www.ee.columbia.edu/~anastas/tdspool. Conclusions: Using a tree-based deterministic sampling technique we present an algorithm for haplotype frequency estimation from pooled data. Our method demonstrates superior performance in datasets with large number of markers and could be the method of choice for haplotype frequency estimation in such datasets.
Pages: 10
Related papers
50 in total
  • [11] Estimation of haplotype frequencies and diplotype configuration for each subject using pooled DNA data.
    Ito, T
    Chiku, S
    Inoue, E
    Tomita, M
    Kamatani, N
    AMERICAN JOURNAL OF HUMAN GENETICS, 2002, 71 (04) : 449 - 449
  • [12] Accurate, scalable and integrative haplotype estimation
    Delaneau, Olivier
    Zagury, Jean-Francois
    Robinson, Matthew R.
    Marchini, Jonathan L.
    Dermitzakis, Emmanouil T.
    NATURE COMMUNICATIONS, 2019, 10 (1)
  • [13] Accurate, scalable and integrative haplotype estimation
    Olivier Delaneau
    Jean-François Zagury
    Matthew R. Robinson
    Jonathan L. Marchini
    Emmanouil T. Dermitzakis
    Nature Communications, 10
  • [14] HybHap: A Fast and Accurate Hybrid Approach for Haplotype Inference on Large Datasets
    Rosa, Rogerio S.
    Guimaraes, Katia S.
    ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2013, 8213 : 24 - 35
  • [15] Malaria haplotype frequency estimation
    Wigger, Leonore
    Vogt, Julia E.
    Roth, Volker
    STATISTICS IN MEDICINE, 2013, 32 (21) : 3737 - 3751
  • [16] Heuristics for haplotype frequency estimation with a large number of analyzed loci
    Nowotka, Michal
    Nowak, Robert
    PHOTONICS APPLICATIONS IN ASTRONOMY, COMMUNICATIONS, INDUSTRY, AND HIGH-ENERGY PHYSICS EXPERIMENTS 2012, 2012, 8454
  • [17] A fast haplotype inference method for large population genotype data
    Zhang, Ji-Hong
    Wu, Ling-Yun
    Chen, Jian
    Zhang, Xiang-Sun
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2008, 52 (11) : 4891 - 4902
  • [18] Computationally feasible estimation of haplotype frequencies from pooled DNA with and without Hardy-Weinberg equilibrium
    Kuk, Anthony Y. C.
    Zhang, Han
    Yang, Yaning
    BIOINFORMATICS, 2009, 25 (03) : 379 - 386
  • [19] Validation of haplotype frequency estimation methods
    Schipper, RF
    D'Amaro, J
    de Lange, P
    Schreuder, GMT
    van Rood, JJ
    Oudshoorn, M
    HUMAN IMMUNOLOGY, 1998, 59 (08) : 518 - 523
  • [20] HapCompass: A Fast Cycle Basis Algorithm for Accurate Haplotype Assembly of Sequence Data
    Aguiar, Derek
    Istrail, Sorin
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2012, 19 (06) : 577 - 590