Guidelines to Statistical Analysis of Microbial Composition Data Inferred from Metagenomic Sequencing

Cited by: 11
Authors
Odintsova, Vera [1]
Tyakht, Alexander [1, 2]
Alexeev, Dmitry [1, 2]
Affiliations
[1] Fed Res & Clin Ctr Phys Chem Med, Malaya Pirogovskaya 1a, Moscow, Russia
[2] Moscow Inst Phys & Technol, Institutskiy Pereulok 9, Dolgoprudnyi, Russia
Keywords
SAMPLE-SIZE; GUT MICROBIOTA; POWER
DOI
10.21775/cimb.024.017
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Metagenomics, the application of high-throughput DNA sequencing to surveys of environmental samples, has revolutionized our view of the taxonomic and genetic composition of complex microbial communities. An enormous richness of microbiota keeps unfolding in fields ranging from biomedicine and the food industry to geology. Primary analysis of metagenomic reads allows one to infer semi-quantitative data describing the community structure. However, such compositional data possess specific statistical properties that are important to consider during preprocessing, hypothesis testing, and interpretation of the results of statistical tests. Failure to account for these properties may lead to fundamentally wrong conclusions from a survey. Here we present an introduction for researchers to the field of metagenomics and to the basic properties of microbial compositional data, including statistical power and proposed distribution models, review the publicly available software tools developed specifically for such data, and outline recommendations for applying the methods.
Pages: 17-36
Number of pages: 20