Systematic evaluation of error rates and causes in short samples in next-generation sequencing

Cited by: 198
Authors
Pfeiffer, Franziska [1]
Groeber, Carsten [2]
Blank, Michael [2]
Haendler, Kristian [3, 4, 5]
Beyer, Marc [3, 4, 5, 6]
Schultze, Joachim L. [3, 4, 5]
Mayer, Guenter [1, 7]
Affiliations
[1] Univ Bonn, LIMES Inst, Chem Biol, Gerhard Domagk Str 1, D-53121 Bonn, Germany
[2] AptaIT GmbH, Klopferspitz 19A, D-82152 Planegg, Germany
[3] Univ Bonn, LIMES Inst, Genom & Immunoregulat, Carl Troll Str 31, D-53115 Bonn, Germany
[4] German Ctr Neurodegenerat Dis DZNE, Sigmund Freud Str 25, D-53127 Bonn, Germany
[5] Univ Bonn, Platform Single Cell Genom & Epigen, Sigmund Freud Str 25, D-53127 Bonn, Germany
[6] DZNE, Mol Immunol Neurodegenerat, Sigmund Freud Str 27, D-53127 Bonn, Germany
[7] Ctr Aptamer Res & Dev, Gerhard Domagk Str 1, D-53121 Bonn, Germany
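Funding
European Research Council;
Keywords
SELEX; PLATFORM; APTAMER;
DOI
10.1038/s41598-018-29325-6
Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07; 0710; 09;
Abstract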
Next-generation sequencing (NGS) is the method of choice when large numbers of sequences have to be obtained. While the technique is widely applied, varying error rates have been observed. We analysed millions of reads obtained after sequencing of one single sequence on an Illumina sequencer. According to our analysis, the index-PCR for sample preparation has no effect on the observed error rate, even though PCR is traditionally seen as one of the major contributors to enhanced error rates in NGS. In addition, we observed very persistent pre-phasing effects although the base calling software corrects for these. Removal of shortened sequences abolished these effects and allowed analysis of the actual mutations. The average error rate determined was 0.24 ± 0.06% per base and the percentage of mutated sequences was found to be 6.4 ± 1.24%. Constant regions at the 5'- and 3'-ends, e.g., primer binding sites used in in vitro selection procedures, seem to have no effect on mutation rates, and re-sequencing of samples yields very reproducible results. As phasing effects and other sequencing problems vary between equipment and individual setups, we recommend that all NGS users evaluate error rates and types to improve the quality and analysis of NGS data.
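The two figures quoted in the abstract (error rate per base and percentage of mutated sequences) can be illustrated with a minimal sketch. It is not the authors' pipeline, which this record does not describe. It assumes that every read should match one known reference sequence, that "shortened sequences" can be approximated by dropping reads whose length differs from the reference, and that only substitutions are counted. The function name `error_stats` and the toy reads are hypothetical.

```python
from typing import Iterable, Tuple


def error_stats(reference: str, reads: Iterable[str]) -> Tuple[float, float, int]:
    """Return (per-base error rate, fraction of mutated reads, reads kept).

    Assumption: reads whose length differs from the reference are treated
    as shortened (or otherwise length-shifted) and are discarded before
    mismatches are counted. Only substitutions are considered.
    """
    kept = 0              # full-length reads that enter the statistics
    mismatched_bases = 0  # total substitutions over all kept reads
    mutated_reads = 0     # kept reads with at least one substitution

    for read in reads:
        if len(read) != len(reference):
            continue  # drop shortened or elongated reads
        kept += 1
        mismatches = sum(1 for a, b in zip(read, reference) if a != b)
        mismatched_bases += mismatches
        if mismatches:
            mutated_reads += 1

    if kept == 0:
        return 0.0, 0.0, 0

    per_base_rate = mismatched_bases / (kept * len(reference))
    mutated_fraction = mutated_reads / kept
    return per_base_rate, mutated_fraction, kept


if __name__ == "__main__":
    reference = "ACGTACGT"
    # Toy data: one perfect read, one read with 1 substitution,
    # one shortened read (discarded), one read with 2 substitutions.
    reads = ["ACGTACGT", "ACGTACGA", "ACGTAC", "TCGTACGA"]
    rate, fraction, kept = error_stats(reference, reads)
    print(f"kept={kept} per-base rate={rate:.3f} mutated reads={fraction:.3f}")
```

With real data, the reads would come from a FASTQ file after primer and index handling. Filtering by length is the simplest stand-in for the removal step the abstract mentions, and a production analysis would need an alignment-based approach to distinguish insertions and deletions from substitutions.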
Pages: 14
Related papers
34 items in total
[1]   Selection of a DNA aptamer against norovirus capsid protein VP1 [J].
Beier, Rico ;
Pahlke, Claudia ;
Quenzel, Philipp ;
Henseleit, Anja ;
Boschke, Elke ;
Cuniberti, Gianaurelio ;
Labudde, Dirk .
FEMS MICROBIOLOGY LETTERS, 2014, 351 (02) :162-169
[2]   Reproducibility of Illumina platform deep sequencing errors allows accurate determination of DNA barcodes in cells [J].
Beltman, Joost B. ;
Urbanus, Jos ;
Velds, Arno ;
van Rooij, Nienke ;
Rohr, Jan C. ;
Naik, Shalin H. ;
Schumacher, Ton N. .
BMC BIOINFORMATICS, 2016, 17
[3]  
Blank M, 2016, METHODS MOL BIOL, V1380, P85, DOI 10.1007/978-1-4939-3197-2_7
[4]   Aptamer Selection Technology and Recent Advances [J].
Blind, Michael ;
Blank, Michael .
MOLECULAR THERAPY-NUCLEIC ACIDS, 2015, 4 :e223
[5]   APTANI: a computational tool to select aptamers through sequence-structure motif analysis of HT-SELEX data [J].
Caroli, J. ;
Taccioli, C. ;
De La Fuente, A. ;
Serafini, P. ;
Bicciato, S. .
BIOINFORMATICS, 2016, 32 (02) :161-164
[6]   AfterQC: automatic filtering, trimming, error removing and quality control for fastq data [J].
Chen, Shifu ;
Huang, Tanxiao ;
Zhou, Yanqing ;
Han, Yue ;
Xu, Mingyan ;
Gu, Jia .
BMC BIOINFORMATICS, 2017, 18
[7]   Systematic evaluation of cell-SELEX enriched aptamers binding to breast cancer cells [J].
Civit, Laia ;
Taghdisi, Seyed Mohammad ;
Jonczyk, Anna ;
Hassel, Silvana K. ;
Groeber, Carsten ;
Blank, Michael ;
Stunden, H. James ;
Beyer, Marc ;
Schultze, Joachim ;
Latz, Eicke ;
Mayer, Guenter .
BIOCHIMIE, 2018, 145 :53-62
[8]   Substantial biases in ultra-short read data sets from high-throughput DNA sequencing [J].
Dohm, Juliane C. ;
Lottaz, Claudio ;
Borodina, Tatiana ;
Himmelbauer, Heinz .
NUCLEIC ACIDS RESEARCH, 2008, 36 (16)
[9]  
Fox Edward J, 2014, Next Gener Seq Appl, V1
[10]   The challenges of sequencing by synthesis [J].
Fuller, Carl W. ;
Middendorf, Lyle R. ;
Benner, Steven A. ;
Church, George M. ;
Harris, Timothy ;
Huang, Xiaohua ;
Jovanovich, Stevan B. ;
Nelson, John R. ;
Schloss, Jeffery A. ;
Schwartz, David C. ;
Vezenov, Dmitri V. .
NATURE BIOTECHNOLOGY, 2009, 27 (11) :1013-1023