GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms

Cited by: 100
Authors
Browne, Patrick Denis [1, 2]
Nielsen, Tue Kjaergaard [1, 2]
Kot, Witold [1, 2]
Aggerholm, Anni [3]
Gilbert, M. Thomas P. [4]
Puetz, Lara [4]
Rasmussen, Morten [5]
Zervas, Athanasios [2]
Hansen, Lars Hestbjerg [1, 2]
Affiliations
[1] Univ Copenhagen, Dept Plant & Environm Sci, Thorvaldsensvej 40, DK-1871 Frederiksberg C, Denmark
[2] Aarhus Univ, Dept Environm Sci, Frederiksborgvej 399, DK-4000 Roskilde, Denmark
[3] Aarhus Univ Hosp, Dept Hematol, Palle Juul Jensens Blvd 99, DK-8200 Aarhus N, Denmark
[4] Univ Copenhagen, Fac Hlth & Biomed Sci, GLOBE Inst, Blegdamsvej 3B, DK-2200 Copenhagen N, Denmark
[5] Stanford Univ, Sch Med, Dept Genet, 291 Campus Dr, Stanford, CA 94305 USA
Source
GigaScience | 2020, Volume 9, Issue 2
Keywords
GC bias; high-throughput sequencing; metagenomics; Illumina; Oxford Nanopore; PacBio; SINGLE-CELL; MICROBIOME; ASSEMBLER; DYNAMICS
DOI
10.1093/gigascience/giaa008
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: Metagenomic sequencing is a well-established tool in the modern biosciences. While it promises unparalleled insights into the genetic content of the biological samples studied, conclusions drawn are at risk from biases inherent to the DNA sequencing methods, including inaccurate abundance estimates as a function of genomic guanine-cytosine (GC) contents.
Results: We explored such GC biases across many commonly used platforms in experiments sequencing multiple genomes (with mean GC contents ranging from 28.9% to 62.4%) and metagenomes. GC bias profiles varied among different library preparation protocols and sequencing platforms. We found that our workflows using MiSeq and NextSeq were hindered by major GC biases, with problems becoming increasingly severe outside the 45-65% GC range, leading to a falsely low coverage in GC-rich and especially GC-poor sequences, where genomic windows with 30% GC content had >10-fold less coverage than windows close to 50% GC content. We also showed that GC content correlates tightly with coverage biases. The PacBio and HiSeq platforms also evidenced similar profiles of GC biases to each other, which were distinct from those seen in the MiSeq and NextSeq workflows. The Oxford Nanopore workflow was not afflicted by GC bias.
Conclusions: These findings indicate potential sources of difficulty, arising from GC biases, in genome sequencing that could be pre-emptively addressed with methodological optimizations provided that the GC biases inherent to the relevant workflow are understood. Furthermore, it is recommended that a more critical approach be taken in quantitative abundance estimates in metagenomic studies. In the future, metagenomic studies should take steps to account for the effects of GC bias before drawing conclusions, or they should use a demonstrably unbiased workflow.
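The window-level comparison described in the Results (coverage in windows near 30% GC versus windows near 50% GC) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `gc_percent` and `gc_bias_profile`, the 1,000 bp default window, the 5-point GC bins, and the assumption that per-base read depth is already available as a list are all choices made here for illustration.

```python
from collections import defaultdict


def gc_percent(seq):
    """Integer GC percentage of seq, counting only A, C, G and T.

    Returns None when the sequence has no unambiguous bases.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return None if total == 0 else 100 * gc // total


def gc_bias_profile(genome, depth, window=1000, bin_pct=5):
    """Mean relative coverage per GC bin (hypothetical helper).

    genome: reference sequence as a string.
    depth:  per-base read depth, same length as genome.
    Returns {lower edge of GC bin in percent: mean of
    (window mean depth / genome-wide mean depth)}.
    Values below 1 indicate under-covered GC ranges.
    """
    if len(genome) != len(depth):
        raise ValueError("genome and depth must have the same length")
    if not depth or sum(depth) == 0:
        raise ValueError("depth must contain at least one non-zero value")
    overall = sum(depth) / len(depth)
    bins = defaultdict(list)
    # Non-overlapping full windows; a trailing partial window is skipped.
    for start in range(0, len(genome) - window + 1, window):
        pct = gc_percent(genome[start:start + window])
        if pct is None:
            continue  # window made up entirely of ambiguous bases
        mean_cov = sum(depth[start:start + window]) / window
        bins[pct // bin_pct * bin_pct].append(mean_cov / overall)
    return {b: sum(v) / len(v) for b, v in sorted(bins.items())}


# Toy input: an AT-only window at depth 1 and a 50% GC window at depth 3.
toy_genome = "ATATATATAT" + "GCGCGATATA"
toy_depth = [1] * 10 + [3] * 10
print(gc_bias_profile(toy_genome, toy_depth, window=10))
```

In an unbiased workflow the values would be close to 1 in every bin; a profile that falls well below 1 at low GC percentages would correspond to the kind of under-representation of GC-poor sequences reported in the abstract.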
Pages: 14