PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data

Cited by: 118
Authors
Anslan, Sten [1]
Bahram, Mohammad [1,2]
Hiiesalu, Indrek [1]
Tedersoo, Leho [3]
Affiliations
[1] Univ Tartu, Inst Ecol & Earth Sci, Tartu, Estonia
[2] Uppsala Univ, Dept Organismal Biol, Evolutionary Biol Ctr, Uppsala, Sweden
[3] Univ Tartu, Nat Hist Museum, Tartu, Estonia
Keywords
high-throughput sequencing; metabarcoding; pipeline; sequencing data analysis; software; SUBUNIT RIBOSOMAL-RNA; MOLECULAR-IDENTIFICATION; HYPERVARIABLE REGIONS; PIPELINE; COMMUNITIES; DIVERSITY; SOFTWARE; READS; FUNGI;
DOI
10.1111/1755-0998.12692
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
High-throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in the public and private sectors. These methods provide vast amounts of data, which need to be analysed in several steps. Although the bioinformatics analysis can be performed using several public tools, many analytical pipelines offer too few options for the optimal analysis of more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We describe the design and options of PipeCraft and evaluate its performance by analysing data sets from three different sequencing platforms. We demonstrate that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable.
Pages: e234-e240
Page count: 7