Richness estimation in microbiome data obtained from denoising pipelines

Cited by: 40
Authors
Bardenhorst, Sven Kleine [1]
Vital, Marius [2]
Karch, Andre [1]
Rubsamen, Nicole [1]
Affiliations
[1] Univ Munster, Inst Epidemiol & Social Med, Munster, Germany
[2] Hannover Med Sch, Inst Med Microbiol & Hosp Hyg, Hannover, Germany
Keywords
Microbiome; Rarefaction; Denoising; Sequencing depth; Sub-sampling
DOI
10.1016/j.csbj.2021.12.036
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The quantification of richness within a sample, either measured as the number of observed species or approximated by estimation, is a common first step in microbiome studies and is known to be highly dependent on sequencing depth, which itself is highly variable between samples. Rarefaction curves serve as a tool to investigate this dependency, and it is often argued that after rarefying data (sub-sampling to an equal sequencing depth) richness estimates no longer depend on sequencing depth. However, the estimation of richness from data obtained by high-throughput sequencing methods and processed by current bioinformatics pipelines may be subject to various sources of variation related to sequencing depth, which may confound inference based on richness estimates and cannot be solved by sub-sampling. We investigated how pipeline settings in DADA2 and deblur affect estimates of richness and showed that the use of rarefaction and sub-sampling is inappropriate when default pipeline settings are applied. Furthermore, we showed how independent sample-wise processing establishes spurious correlations between sequencing depth and richness estimates in data produced by DADA2 and how this problem can be solved by a pooled processing approach. © 2022 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology.
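The rarefying step mentioned in the abstract, sub-sampling every sample to an equal sequencing depth before counting observed features, can be sketched roughly as follows. This is only an illustration of the general procedure, not the authors' code: the function names `observed_richness` and `rarefy` and the two count vectors are hypothetical, and the sketch does not reproduce the pipeline effects (DADA2 or deblur settings) that the paper argues make this approach inappropriate.

```python
import numpy as np


def observed_richness(counts):
    """Observed richness: number of features with at least one read."""
    return int(np.count_nonzero(np.asarray(counts)))


def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from a vector of feature counts."""
    counts = np.asarray(counts, dtype=np.int64)
    if depth > counts.sum():
        raise ValueError("depth exceeds the sample's sequencing depth")
    # Sampling reads without replacement is a multivariate hypergeometric draw.
    return rng.multivariate_hypergeometric(counts, depth)


rng = np.random.default_rng(0)

# Made-up count vectors (assumption): one deeply and one shallowly sequenced sample.
deep = np.array([500, 300, 100, 50, 30, 10, 5, 3, 1, 1])
shallow = np.array([50, 30, 10, 5, 3, 1, 1, 0, 0, 0])

# Rarefy both samples to the smaller of the two sequencing depths.
depth = int(min(deep.sum(), shallow.sum()))
deep_rarefied = rarefy(deep, depth, rng)
shallow_rarefied = rarefy(shallow, depth, rng)

print("depth:", depth)
print("richness before:", observed_richness(deep), observed_richness(shallow))
print("richness after: ", observed_richness(deep_rarefied), observed_richness(shallow_rarefied))
```

Sub-sampling can only remove features, so rarefied richness never exceeds the unrarefied value. Equalizing depth in this way addresses the sampling effect alone, which is the limitation the abstract points to.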
Pages: 508-520
Number of pages: 13