Swarm v2: highly-scalable and high-resolution amplicon clustering

Citations: 401
Authors
Mahe, Frederic [1]
Rognes, Torbjorn [2, 3]
Quince, Christopher [4]
de Vargas, Colomban [5, 6]
Dunthorn, Micah [1]
Affiliations
[1] Tech Univ Kaiserslautern, Dept Ecol, Kaiserslautern, Germany
[2] Univ Oslo, Dept Informat, N-0316 Oslo, Norway
[3] Natl Hosp Norway, Oslo Univ Hosp, Dept Microbiol, Oslo, Norway
[4] Univ Warwick, Warwick Med Sch, Warwick, England
[5] CNRS, Stn Biol Roscoff, EPEP Evolut Protistes & Ecosyst Pelag, UMR 7144, Roscoff, France
[6] Univ Paris 06, Sorbonne Univ, Stn Biol Roscoff UMR7144, Roscoff, France
Source
PEERJ | 2015 / Volume 3
Funding
Engineering and Physical Sciences Research Council (UK);
Keywords
Environmental diversity; Barcoding; Molecular operational taxonomic units; OPERATIONAL TAXONOMIC UNITS; CILIATE ENVIRONMENTAL DIVERSITY; SEQUENCING DATA; RARE BIOSPHERE; COMMUNITIES; WRINKLES; ACCURATE; REGIONS;
DOI
10.7717/peerj.1420
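Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract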
Previously we presented Swarm v1, a novel and open source amplicon clustering program that produced fine-scale molecular operational taxonomic units (OTUs), free of arbitrary global clustering thresholds and input-order dependency. Swarm v1 worked with an initial phase that used iterative single-linkage with a local clustering threshold (d), followed by a phase that used the internal abundance structures of clusters to break chained OTUs. Here we present Swarm v2, which has two important novel features: (1) a new algorithm for d = 1 that allows the computation time of the program to scale linearly with increasing amounts of data; and (2) the new fastidious option that reduces under-grouping by grafting low abundant OTUs (e.g., singletons and doubletons) onto larger ones. Swarm v2 also directly integrates the clustering and breaking phases, dereplicates sequencing reads with d = 0, outputs OTU representatives in fasta format, and plots individual OTUs as two-dimensional networks.
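
The abstract does not spell out how the d = 1 algorithm reaches linear scaling. One plausible way, sketched below in Python, is to avoid all-versus-all comparisons: store the dereplicated sequences in a hash table, enumerate every sequence that is one edit away from the current one, and look each of them up. The function names `microvariants` and `cluster_d1`, the input layout (a dict of unique sequence to abundance), and the rule that a neighbour is linked only when it is not more abundant than its parent are all assumptions made for illustration; this is not the Swarm source code.

```python
from collections import deque

ALPHABET = "ACGT"


def microvariants(seq):
    """Return every sequence exactly one edit away from seq
    (one substitution, one deletion, or one insertion)."""
    variants = set()
    n = len(seq)
    for i in range(n):
        # Substitutions at position i.
        for base in ALPHABET:
            if base != seq[i]:
                variants.add(seq[:i] + base + seq[i + 1:])
        # Deletion of position i.
        variants.add(seq[:i] + seq[i + 1:])
    for i in range(n + 1):
        # Insertions before position i (and at the end when i == n).
        for base in ALPHABET:
            variants.add(seq[:i] + base + seq[i:])
    return variants


def cluster_d1(abundances):
    """Cluster dereplicated sequences with a local threshold of d = 1.

    abundances: dict mapping each unique sequence to its abundance.
    Returns a list of clusters; each cluster is a list of sequences
    with the seed first.
    """
    # Seeds are visited by decreasing abundance (ties broken
    # alphabetically), so the result does not depend on input order.
    order = sorted(abundances, key=lambda s: (-abundances[s], s))
    assigned = set()
    clusters = []
    for seed in order:
        if seed in assigned:
            continue
        assigned.add(seed)
        cluster = [seed]
        queue = deque([seed])
        while queue:
            parent = queue.popleft()
            for variant in sorted(microvariants(parent)):
                # Hash lookups replace pairwise comparisons.
                # ASSUMPTION: a neighbour joins only if it is not more
                # abundant than its parent, which stands in for the
                # integrated "breaking" phase mentioned in the abstract.
                if (variant in abundances
                        and variant not in assigned
                        and abundances[variant] <= abundances[parent]):
                    assigned.add(variant)
                    cluster.append(variant)
                    queue.append(variant)
        clusters.append(cluster)
    return clusters


if __name__ == "__main__":
    toy = {"ACGT": 10, "ACGA": 3, "ACCA": 1, "TTTT": 5, "TTTA": 6}
    print(cluster_d1(toy))
    # prints [['ACGT', 'ACGA', 'ACCA'], ['TTTA', 'TTTT']]
```

Each sequence of length L has a number of microvariants proportional to L, so the work per sequence does not grow with the size of the dataset, which is consistent with the linear scaling the abstract reports. In the toy input, "ACCA" is two edits from the seed "ACGT" but still joins its cluster through "ACGA", which shows the iterative single-linkage behaviour.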
Pages: 12