Characterization of sequence-specific errors in various next-generation sequencing systems

Cited by: 28
Authors
Shin, Sunguk [1]
Park, Joonhong [1]
Affiliations
[1] Yonsei Univ, Dept Civil & Environm Engn, Yonsei Ro 50, Seoul 120749, South Korea
Funding
National Research Foundation of Singapore;
Keywords
DNA; DIVERSITY;
DOI
10.1039/c5mb00750j
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Next-generation sequencing (NGS) is a popular method for assessing the molecular diversity of microbial communities without cultivation, for identifying polymorphisms in populations, and for comparing genomes and transcriptomes. However, sequence-specific errors (SSEs) produced by NGS systems can result in genome mis-assembly, overestimation of diversity in microbial community analyses, and false polymorphism discovery. SSEs can be particularly problematic due to rich microbial biodiversity and genomes containing frequent repeats. In this study, SSEs in public data from all popular NGS systems were discovered using a Markov chain model, and hotspots for sequence errors were identified. Deletion errors were frequently preceded by homopolymers in non-Illumina NGS systems, such as GS FLX+. Substitution errors were often related to high GC content and long G/C homopolymers in Illumina sequencing systems such as HiSeq. After removal of long G/C homopolymers in HiSeq, the average lengths of contigs and average SNP quality increased. SSEs were selectively removed from our mock community data by quality filtering, and a bias against specific microbes was identified. Our findings provide a scientific basis for filtering poor-quality reads, correcting deletion errors, preventing genome mis-assembly, and accurately assessing microbial community compositions and polymorphisms.
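The removal of reads containing long G/C homopolymers mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names and the threshold `MIN_RUN = 8` are hypothetical, since the abstract does not state what run length counts as "long".

```python
import re

# Assumed threshold: the abstract does not define "long", so 8 is illustrative only.
MIN_RUN = 8


def longest_gc_homopolymer(read: str) -> int:
    """Return the length of the longest run of a single repeated G or C base."""
    runs = re.findall(r"G+|C+", read.upper())
    return max((len(run) for run in runs), default=0)


def drop_long_gc_homopolymers(reads, min_run=MIN_RUN):
    """Keep only reads whose longest G or C homopolymer is shorter than min_run."""
    return [r for r in reads if longest_gc_homopolymer(r) < min_run]


# Toy reads (made up for illustration, not taken from the study's data).
reads = ["ACGTACGT", "ATGGGGGGGGGTCA", "ccccccccAT", "AGCGCGCGCT"]
kept = drop_long_gc_homopolymers(reads)
```

Note that a read with high GC content but no long single-base run, such as `AGCGCGCGCT`, passes this filter; only uninterrupted runs of G or of C are counted.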
Pages: 914-922
Number of pages: 9