SimkaMin: fast and resource frugal de novo comparative metagenomics

Cited by: 7
Authors
Benoit, Gaetan [1]
Mariadassou, Mahendra [2]
Robin, Stephane [3]
Schbath, Sophie [2]
Peterlongo, Pierre [1]
Lemaitre, Claire [1]
Affiliations
[1] Univ Rennes, IRISA, CNRS, INRIA, F-35000 Rennes, France
[2] Univ Paris Saclay, INRA, MaIAGE, F-78350 Jouy En Josas, France
[3] Univ Paris Saclay, INRA, AgroParisTech, UMR MIA Paris, F-75005 Paris, France
Keywords
PLANKTON
DOI
10.1093/bioinformatics/btz685
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: De novo comparative metagenomics is one of the most straightforward ways to analyze large sets of metagenomic data. The latest methods use the fraction of shared k-mers to estimate genomic similarity between read sets. However, these methods, while extremely efficient, are still limited by their computational requirements, which hinders practical use outside of large computing facilities. Results: We present SimkaMin, a fast comparative metagenomics tool with low disk and memory footprints, thanks to an efficient data subsampling scheme used to estimate Bray-Curtis and Jaccard dissimilarities. One billion metagenomic reads can be analyzed in under 3 min, with small memory (1.09 GB) and disk (≈0.3 GB) requirements and without altering the quality of the downstream comparative analyses, making SimkaMin a tool well suited to very large-scale metagenomic projects.
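The subsampling idea in the abstract can be illustrated with a minimal sketch. This is not SimkaMin's implementation: it assumes a bottom-s MinHash-style sketch (in the spirit of Mash and Broder's resemblance estimator) that keeps the k-mers with the smallest hash values together with their counts. The function names (`kmer_counts`, `kmer_hash`, `sketch`, `estimate_dissimilarities`), the choice of BLAKE2b as the hash, and the omission of canonical (reverse-complement) k-mers are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_hash(kmer):
    """Deterministic 64-bit hash of a k-mer (hash choice is an assumption)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(counts, size):
    """Keep only the `size` k-mers with the smallest hashes, with their counts."""
    smallest = sorted(counts, key=kmer_hash)[:size]
    return {kmer: counts[kmer] for kmer in smallest}


def estimate_dissimilarities(sketch_a, sketch_b, size):
    """Estimate (Jaccard, Bray-Curtis) dissimilarities from two sketches.

    Only the `size` smallest-hash k-mers of the merged sketches are used.
    Any such k-mer that occurs in a dataset is guaranteed to be in that
    dataset's own sketch, so a missing entry means a true count of zero.
    """
    union = sorted(set(sketch_a) | set(sketch_b), key=kmer_hash)[:size]
    if not union:
        return 0.0, 0.0
    shared = sum(1 for km in union if km in sketch_a and km in sketch_b)
    jaccard = 1.0 - shared / len(union)
    overlap = sum(min(sketch_a.get(km, 0), sketch_b.get(km, 0)) for km in union)
    total = sum(sketch_a.get(km, 0) + sketch_b.get(km, 0) for km in union)
    bray_curtis = 1.0 - 2.0 * overlap / total
    return jaccard, bray_curtis


def compare(reads_a, reads_b, k=21, size=1000):
    """Sketch two read sets and return their estimated dissimilarities."""
    sk_a = sketch(kmer_counts(reads_a, k), size)
    sk_b = sketch(kmer_counts(reads_b, k), size)
    return estimate_dissimilarities(sk_a, sk_b, size)
```

When the sketch size exceeds the number of distinct k-mers, the estimates coincide with the exact values; with a smaller sketch, memory and disk usage scale with the sketch size rather than with the number of reads, which is the general trade-off the abstract describes.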
Pages: 1275-1276
Page count: 2