PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data

Cited by: 118

Authors:

Anslan, Sten [1]

Bahram, Mohammad [1,2]

Hiiesalu, Indrek [1]

Tedersoo, Leho [3]

Affiliations:

[1] Univ Tartu, Inst Ecol & Earth Sci, Tartu, Estonia

[2] Uppsala Univ, Dept Organismal Biol, Evolutionary Biol Ctr, Uppsala, Sweden

[3] Univ Tartu, Nat Hist Museum, Tartu, Estonia

Source:

MOLECULAR ECOLOGY RESOURCES | 2017 / Volume 17 / Issue 06

Keywords:

high-throughput sequencing; metabarcoding; pipeline; sequencing data analysis; software; SUBUNIT RIBOSOMAL-RNA; MOLECULAR-IDENTIFICATION; HYPERVARIABLE REGIONS; PIPELINE; COMMUNITIES; DIVERSITY; SOFTWARE; READS; FUNGI;

DOI:

10.1111/1755-0998.12692

Chinese Library Classification number:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

High-throughput sequencing methods have become a routine analysis tool in the environmental sciences as well as in the public and private sectors. These methods provide vast amounts of data, which need to be analysed in several steps. Although the bioinformatics analysis may be performed using several public tools, many analytical pipelines offer too few options for the optimal analysis of more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We described the design and options of PipeCraft and evaluated its performance by analysing data sets from three different sequencing platforms. We demonstrated that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable.


Pages: e234-e240

Page count: 7
