Characterization of sequence-specific errors in various next-generation sequencing systems

Cited by: 28
Authors
Shin, Sunguk [1]
Park, Joonhong [1]
Affiliations
[1] Yonsei Univ, Dept Civil & Environm Engn, Yonsei Ro 50, Seoul 120749, South Korea
Funding
National Research Foundation, Singapore
Keywords
DNA; DIVERSITY
DOI
10.1039/c5mb00750j
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
Next-generation sequencing (NGS) is a popular method for assessing the molecular diversity of microbial communities without cultivation, for identifying polymorphisms in populations, and for comparing genomes and transcriptomes. However, sequence-specific errors (SSEs) by NGS systems can result in genome mis-assembly, overestimation of diversity in microbial community analyses, and false polymorphism discovery. SSEs can be particularly problematic due to rich microbial biodiversity and genomes containing frequent repeats. In this study, SSEs in public data from all popular NGS systems were discovered using a Markov chain model and hotspots for sequence errors were identified. Deletion errors were frequently preceded by homopolymers in non-Illumina NGS systems, such as GS FLX+. Substitution errors were often related to high GC contents and long G/C homopolymers in Illumina sequencing systems such as HiSeq. After removal of long G/C homopolymers in HiSeq, the average lengths of contigs and average SNP quality increased. SSEs were selectively removed from our mock community data by quality filtering, and a bias against specific microbes was identified. Our findings provide a scientific basis for filtering poor-quality reads, correcting deletion errors, preventing genome mis-assembly, and accurately assessing microbial community compositions and polymorphisms.
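The abstract links substitution errors in Illumina data to high GC content and long G/C homopolymers. The following is a minimal sketch of how reads could be screened for those two features. It is not the authors' method: the function names and the `max_run` threshold of 6 are hypothetical choices made only for illustration.

```python
import re


def gc_content(read: str) -> float:
    """Return the fraction of G and C bases in a read (0.0 if empty)."""
    if not read:
        return 0.0
    read = read.upper()
    return (read.count("G") + read.count("C")) / len(read)


def longest_gc_homopolymer(read: str) -> int:
    """Return the length of the longest run of a single repeated G or C."""
    runs = re.findall(r"G+|C+", read.upper())
    return max((len(run) for run in runs), default=0)


def filter_reads(reads, max_run=6):
    """Keep reads whose longest G/C homopolymer is at most max_run bases.

    max_run is an assumed, illustrative threshold, not a value from the paper.
    """
    return [r for r in reads if longest_gc_homopolymer(r) <= max_run]
```

For example, `filter_reads(["ACGT", "AGGGGGGGT"])` keeps only the first read, because the second contains a run of seven G bases.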
Pages: 914-922
Number of pages: 9