systemPipeR: NGS workflow and report generation environment

Cited by: 124
Authors
Backman, Tyler W. H. [1]
Girke, Thomas [1]
Affiliations
[1] Univ Calif Riverside, Inst Integrat Genome Biol, 1207F Genom Bldg,3401 Watkins Dr, Riverside, CA 92521 USA
Funding
National Institutes of Health (USA); National Institute of Food and Agriculture (USA); National Science Foundation (USA)
Keywords
Analysis workflow; Next Generation Sequencing (NGS); Ribo-Seq; ChIP-Seq; RNA-Seq; VAR-Seq; BIOCONDUCTOR PACKAGE; SEQUENCE-ANALYSIS; EXPLORATION; ANNOTATION; GENOMICS; GALAXY; CLOUD;
DOI
10.1186/s12859-016-1241-0
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Next-generation sequencing (NGS) has revolutionized how research is carried out in many areas of biology and medicine. However, the analysis of NGS data remains a major obstacle to the efficient utilization of the technology, as it requires complex multi-step processing of big data, demanding considerable computational expertise from users. While substantial effort has been invested in the development of software dedicated to the individual analysis steps of NGS experiments, insufficient resources are currently available for integrating the individual software components within the widely used R/Bioconductor environment into automated workflows capable of running the analysis of most types of NGS applications from start to finish in a time-efficient and reproducible manner.
Results: To address this need, we have developed the R/Bioconductor package systemPipeR. It is an extensible environment for both building and running end-to-end analysis workflows with automated report generation for a wide range of NGS applications. Its unique features include a uniform workflow interface across different NGS applications, automated report generation, and support for running both R and command-line software on local computers and computer clusters. A flexible sample annotation infrastructure efficiently handles complex sample sets and experimental designs. To simplify the analysis of widely used NGS applications, the package provides pre-configured workflows and reporting templates for RNA-Seq, ChIP-Seq, VAR-Seq and Ribo-Seq. Additional workflow templates will be provided in the future.
Conclusions: systemPipeR accelerates the extraction of reproducible analysis results from NGS experiments. By combining the capabilities of many R/Bioconductor and command-line tools, it makes efficient use of existing software resources without limiting the user to a set of predefined methods or environments.
systemPipeR is freely available for all common operating systems from Bioconductor (http://bioconductor.org/packages/devel/systemPipeR).
Pages: 8