Evaluation of Allele Frequency Estimation Using Pooled Sequencing Data Simulation

Cited by: 11
Authors
Guo, Yan [1]
Samuels, David C. [2]
Li, Jiang [1]
Clark, Travis [3]
Li, Chung-I [1]
Shyr, Yu [1]
Affiliations
[1] Vanderbilt Ingram Canc Ctr, Ctr Quantitat Sci, Nashville, TN USA
[2] Vanderbilt Univ, Med Ctr, Ctr Human Genet Res, Nashville, TN USA
[3] Vanderbilt Univ, VANTAGE, Nashville, TN USA
Source
SCIENTIFIC WORLD JOURNAL | 2013
Keywords
GENOME-WIDE ASSOCIATION; LINKAGE DISEQUILIBRIUM; SYNDROME LOCUS; RARE VARIANTS; HUMAN-DISEASE; DNA SAMPLES; GENES; IDENTIFICATION; POPULATIONS; MUTATIONS;
DOI
10.1155/2013/895496
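Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code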
07 ; 0710 ; 09 ;
Abstract
Next-generation sequencing (NGS) technology lets researchers study the genome in unprecedented detail, in particular in disease association studies. Unlike genotyping chips, NGS is not limited to a fixed set of SNPs. NGS prices are now comparable to those of SNP chips, although the cost can still be substantial for large studies, so pooling techniques are often used to reduce the overall cost. In this study, we designed a rigorous simulation model to test the practicability of estimating allele frequency from pooled sequencing data. We took crucial factors into consideration, including pool size, overall depth, average depth per sample, pooling variation, and sampling variation. We used real data to demonstrate and measure reference allele preference in DNAseq data and implemented this bias in our simulation model. We found that pooled sequencing data can introduce a high relative error rate (defined as the error rate divided by the targeted allele frequency) and that the error is more severe for SNPs with low minor allele frequency than for SNPs with high minor allele frequency. To overcome the error introduced by pooling, we recommend a large pool size and a high average depth per sample.
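The factors listed in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' model: the function names, the Gaussian pooling weights (`pooling_sd`), and the way reference allele preference is applied (`ref_bias`, a down-weighting of alternate-allele reads) are all assumptions made for illustration. Relative error is computed as in the abstract, as the absolute estimation error divided by the targeted allele frequency.

```python
import random


def simulate_pool_estimate(true_af, pool_size, depth_per_sample,
                           pooling_sd=0.2, ref_bias=0.0, rng=None):
    """Return one allele frequency estimate from a simulated pool.

    All modelling choices here are illustrative assumptions:
    diploid samples, Gaussian pooling weights, and a multiplicative
    down-weighting of alternate-allele reads (0 <= ref_bias < 1).
    """
    if not 0.0 <= ref_bias < 1.0:
        raise ValueError("ref_bias must be in [0, 1)")
    rng = rng or random.Random()

    # Sampling variation: draw two alleles per diploid individual.
    alt_counts = [sum(rng.random() < true_af for _ in range(2))
                  for _ in range(pool_size)]

    # Pooling variation: individuals contribute unequal amounts of DNA.
    weights = [max(rng.gauss(1.0, pooling_sd), 0.01)
               for _ in range(pool_size)]
    pooled_af = (sum(w * a / 2 for w, a in zip(weights, alt_counts))
                 / sum(weights))

    # Reference allele preference: alternate reads are under-represented.
    alt_weight = pooled_af * (1.0 - ref_bias)
    read_alt_prob = alt_weight / (alt_weight + (1.0 - pooled_af))

    # Sequencing: overall depth = pool size x average depth per sample.
    depth = pool_size * depth_per_sample
    alt_reads = sum(rng.random() < read_alt_prob for _ in range(depth))
    return alt_reads / depth


def mean_relative_error(true_af, pool_size, depth_per_sample,
                        n_rep=200, seed=0, **kwargs):
    """Average |estimate - true_af| / true_af over n_rep simulated pools."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_rep):
        est = simulate_pool_estimate(true_af, pool_size, depth_per_sample,
                                     rng=rng, **kwargs)
        errors.append(abs(est - true_af) / true_af)
    return sum(errors) / len(errors)


if __name__ == "__main__":
    for af in (0.01, 0.05, 0.3):
        err = mean_relative_error(af, pool_size=50, depth_per_sample=10)
        print(f"target AF {af:.2f}: mean relative error {err:.3f}")
```

Running the script with different `pool_size` and `depth_per_sample` values gives a rough feel for how the relative error responds to each factor under these assumed settings; the numbers it prints are not results from the paper.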
Pages: 9