Sources of PCR-induced distortions in high-throughput sequencing data sets

Cited by: 172
Authors
Kebschull, Justus M. [1, 2]
Zador, Anthony M. [2 ]
Affiliations
[1] Cold Spring Harbor Lab, Watson Sch Biol Sci, Cold Spring Harbor, NY 11724 USA
[2] Cold Spring Harbor Lab, Cold Spring Harbor, NY 11724 USA
Funding
National Institutes of Health (USA);
Keywords
DNA-SYNTHESIS; RNA-SEQ; AMPLIFICATION; NOISE; MOLECULES; FIDELITY; LENGTH; BIASES;
DOI
10.1093/nar/gkv717
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four important sources of error (bias, stochasticity, template switches and polymerase errors) on sequence representation in low-input next-generation sequencing libraries. We designed a pool of diverse PCR amplicons with a defined structure, and then used Illumina sequencing to search for signatures of each process. We further developed quantitative models for each process, and compared predictions of these models to our experimental data. We find that PCR stochasticity is the major force skewing sequence representation after amplification of a pool of unique DNA amplicons. Polymerase errors become very common in later cycles of PCR but have little impact on the overall sequence distribution as they are confined to small copy numbers. PCR template switches are rare and confined to low copy numbers. Our results provide a theoretical basis for removing distortions from high-throughput sequencing data. In addition, our findings on PCR stochasticity will have particular relevance to quantification of results from single-cell sequencing, in which sequences are represented by only one or a few molecules.
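The abstract's point about PCR stochasticity can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes a simple branching process in which every existing copy is duplicated independently with a fixed probability per cycle. The names `simulate_pcr`, `coefficient_of_variation` and the parameter `efficiency` are hypothetical and chosen only for this illustration.

```python
import random
import statistics


def simulate_pcr(n_molecules, cycles, efficiency, seed=0):
    """Return final copy numbers for n_molecules unique starting templates.

    Assumption (illustrative only): in each cycle, every existing copy
    is duplicated independently with probability `efficiency`.
    """
    rng = random.Random(seed)
    counts = [1] * n_molecules  # each unique template starts as a single molecule
    for _ in range(cycles):
        # Each of the c copies of a template yields one extra copy with probability `efficiency`.
        counts = [c + sum(rng.random() < efficiency for _ in range(c)) for c in counts]
    return counts


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean of the copy numbers."""
    return statistics.pstdev(counts) / statistics.mean(counts)
```

Under these assumptions, calling `simulate_pcr(200, 10, 0.8)` and passing the result to `coefficient_of_variation` shows how templates that all start at one molecule end up with different copy numbers whenever the efficiency lies strictly between 0 and 1. With an efficiency of exactly 1 every template reaches `2**cycles` copies and the spread vanishes.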
Pages: 15
Related papers
33 in total
  • [1] Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries
    Aird, Daniel
    Ross, Michael G.
    Chen, Wei-Sheng
    Danielsson, Maxwell
    Fennell, Timothy
    Russ, Carsten
    Jaffe, David B.
    Nusbaum, Chad
    Gnirke, Andreas
    [J]. GENOME BIOLOGY, 2011, 12 (02)
  • [2] Brennecke P, 2013, NAT METHODS, V10, P1093, DOI [10.1038/NMETH.2645, 10.1038/nmeth.2645]
  • [3] A method for counting PCR template molecules with application to next-generation sequencing
    Casbon, James A.
    Osborne, Robert J.
    Brenner, Sydney
    Lichtenstein, Conrad P.
    [J]. NUCLEIC ACIDS RESEARCH, 2011, 39 (12) : e81
  • [4] Length and GC-biases during sequencing library amplification: A comparison of various polymerase-buffer systems with ancient and modern DNA sequencing libraries
    Dabney, Jesse
    Meyer, Matthias
    [J]. BIOTECHNIQUES, 2012, 52 (02) : 87 - +
  • [5] Comprehensive human genome amplification using multiple displacement amplification
    Dean, FB
    Hosono, S
    Fang, LH
    Wu, XH
    Faruqi, AF
    Bray-Ward, P
    Sun, ZY
    Zong, QL
    Du, YF
    Du, J
    Driscoll, M
    Song, WM
    Kingsmore, SF
    Egholm, M
    Lasken, RS
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (08) : 5261 - 5266
  • [6] Substantial biases in ultra-short read data sets from high-throughput DNA sequencing
    Dohm, Juliane C.
    Lottaz, Claudio
    Borodina, Tatiana
    Himmelbauer, Heinz
    [J]. NUCLEIC ACIDS RESEARCH, 2008, 36 (16)
  • [7] Eckert K A, 1991, PCR Methods Appl, V1, P17
  • [8] Counting individual DNA molecules by the stochastic attachment of diverse labels
    Fu, Glenn K.
    Hu, Jing
    Wang, Pei-Hua
    Fodor, Stephen P. A.
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2011, 108 (22) : 9026 - 9031
  • [9] Grün D, 2014, NAT METHODS, V11, P637, DOI [10.1038/NMETH.2930, 10.1038/nmeth.2930]
  • [10] Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons
    Haas, Brian J.
    Gevers, Dirk
    Earl, Ashlee M.
    Feldgarden, Mike
    Ward, Doyle V.
    Giannoukos, Georgia
    Ciulla, Dawn
    Tabbaa, Diana
    Highlander, Sarah K.
    Sodergren, Erica
    Methe, Barbara
    DeSantis, Todd Z.
    Petrosino, Joseph F.
    Knight, Rob
    Birren, Bruce W.
    [J]. GENOME RESEARCH, 2011, 21 (03) : 494 - 504