Swarm v3: towards tera-scale amplicon clustering

Cited by: 68
Authors
Mahe, Frederic [1, 2]
Czech, Lucas [3, 4]
Stamatakis, Alexandros [3, 5]
Quince, Christopher [6, 7, 8]
de Vargas, Colomban [9, 10]
Dunthorn, Micah [11, 12]
Rognes, Torbjorn [13, 14]
Affiliations
[1] CIRAD, UMR PHIM, Montpellier, France
[2] Univ Montpellier, PHIM Plant Hlth Inst, Inst Agro, INRAE,IRD,CIRAD, Montpellier, France
[3] Heidelberg Inst Theoret Studies, Computat Mol Evolut Grp, Heidelberg, Germany
[4] Carnegie Inst Sci, Dept Plant Biol, 290 Panama St, Stanford, CA 94305 USA
[5] Karlsruhe Inst Technol, Inst Theoret Informat, Karlsruhe, Germany
[6] Earlham Inst, Organisms & Ecosyst, Norwich, Norfolk, England
[7] Quadram Inst, Gut Microbes & Hlth, Norwich, Norfolk, England
[8] Univ Warwick, Warwick Med Sch, Coventry, W Midlands, England
[9] Sorbonne Univ, ECOMAP, UMR7144, Stn Biol Roscoff,CNRS, Roscoff, France
[10] Res Federat Study Global Ocean Syst Ecol & Evolut, FR2022 Tara GOSEE, Paris, France
[11] Univ Oslo, Nat Hist Museum, Oslo, Norway
[12] Univ Duisburg Essen, Eukaryot Microbiol, Essen, Germany
[13] Univ Oslo, Dept Informat, Oslo, Norway
[14] Oslo Univ Hosp, Rikshosp, Dept Microbiol, Oslo, Norway
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC)
DOI
10.1093/bioinformatics/btab493
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: We previously presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) free of arbitrary global clustering thresholds. Here, we present swarm v3, which addresses the challenges posed by contemporary datasets that are growing towards terabyte sizes.
Results: Compared with previous swarm versions, swarm v3 has modernized C++ source code, a memory footprint reduced by up to 50%, and optimized CPU usage and multithreading (more than 7 times faster with default parameters). It has also been extensively tested for robustness and logic.
Pages: 267-269
Page count: 3