StreamingTrim 1.0: a Java software for dynamic trimming of 16S rRNA sequence data from metagenetic studies

Cited by: 27
Authors
Bacci, G. [1, 2]
Bazzicalupo, M. [1]
Benedetti, A. [2]
Mengoni, A. [1]
Affiliations
[1] Univ Florence, Dept Biol, I-50019 Florence, Italy
[2] Ctr Ric Studio Relaz Tra Pianta & Suolo CRA RPS, Consiglio Ric Sperimentaz Agr, I-00184 Rome, Italy
Keywords
amplicon libraries; next-generation sequencing; dynamic trimming; metagenetics; DEEP-SEA; MICROBIAL EUKARYOTES; COMMUNITY; BIOSPHERE; QUALITY;
DOI
10.1111/1755-0998.12187
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Next-generation sequencing technologies are extensively used in molecular microbial ecology to describe the taxonomic composition and to infer the functionality of microbial communities. In particular, the so-called barcode or metagenetic applications, which are based on sequencing PCR amplicon libraries, are currently very popular. One problem in using the data from these libraries is assessing read quality and removing (trimming) low-quality segments while retaining sufficient information for subsequent analyses (e.g. taxonomic assignment). Here, we present StreamingTrim, a DNA read trimming software written in Java, with which researchers can analyse the quality of DNA sequences in fastq files and search for low-quality zones in a very conservative way. The software was developed to provide a tool capable of trimming amplicon library data while retaining as much taxonomic information as possible. It is equipped with a graphical user interface for ease of use. Moreover, from a computational point of view, StreamingTrim reads and analyses sequences one by one from an input fastq file, without keeping anything in memory, which allows the computation to run on a normal desktop PC or even a laptop. Trimmed sequences are saved in an output file, and a statistics summary is displayed that contains the mean and standard deviation of the length and quality of the whole sequence file. Compiled software, a manual and example data sets are available under the BSD-2-Clause License at the GitHub repository at .
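The record-by-record processing described in the abstract can be illustrated with a minimal Java sketch. It is not StreamingTrim's code or its trimming algorithm, which the abstract does not specify. The class name `StreamingTrimSketch`, the method names, the Phred+33 quality encoding, the four-line record layout and the simple 3'-end threshold rule are all assumptions made for illustration. The running mean and standard deviation use Welford's method so that no reads need to be held in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Writer;

// Illustrative sketch only; names and the trimming rule are hypothetical.
final class StreamingTrimSketch {

    /** Running mean and sample standard deviation (Welford's method); stores no values. */
    static final class RunningStats {
        long n;
        double mean;
        double m2;

        void add(double x) {
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean);
        }

        double sd() {
            return n > 1 ? Math.sqrt(m2 / (n - 1)) : 0.0;
        }
    }

    /** Assumed rule: drop 3' bases while their Phred+33 score is below minQ. */
    static int trimmedLength(String qual, int minQ) {
        int end = qual.length();
        while (end > 0 && qual.charAt(end - 1) - 33 < minQ) {
            end--;
        }
        return end;
    }

    /**
     * Reads one four-line FASTQ record at a time, writes the trimmed record,
     * and returns running statistics: [0] read length, [1] mean read quality.
     */
    static RunningStats[] process(BufferedReader in, Writer out, int minQ) throws IOException {
        RunningStats length = new RunningStats();
        RunningStats quality = new RunningStats();
        String header;
        while ((header = in.readLine()) != null) {
            if (header.isEmpty()) {
                continue; // tolerate blank lines between records
            }
            String seq = in.readLine();
            String plus = in.readLine();
            String qual = in.readLine();
            if (seq == null || plus == null || qual == null || seq.length() != qual.length()) {
                throw new IOException("Malformed FASTQ record: " + header);
            }
            int end = trimmedLength(qual, minQ);
            if (end == 0) {
                continue; // nothing left after trimming, so the read is dropped
            }
            double sum = 0;
            for (int i = 0; i < end; i++) {
                sum += qual.charAt(i) - 33;
            }
            length.add(end);
            quality.add(sum / end);
            out.write(header + "\n" + seq.substring(0, end) + "\n" + plus + "\n"
                    + qual.substring(0, end) + "\n");
        }
        return new RunningStats[] {length, quality};
    }
}
```

In this sketch only the current record and a few counters are held at any time, so memory use does not grow with the size of the input file. A caller would wrap the input in a `BufferedReader`, pass a `Writer` for the output file, and read the mean and standard deviation from the two returned `RunningStats` objects.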
Pages: 426-434
Number of pages: 9