NGSANE: a lightweight production informatics framework for high-throughput data analysis

Cited by: 18
Authors
Buske, Fabian A. [1]
French, Hugh J. [1]
Smith, Martin A. [2, 3]
Clark, Susan J. [1, 3]
Bauer, Denis C. [4]
Affiliations
[1] Univ NSW, Garvan Inst Med Res, Kinghorn Canc Ctr, Canc Epigenet Program, Canc Res Div, Sydney, NSW 2010, Australia
[2] Univ NSW, Garvan Inst Med Res, RNA Biol & Plast Lab, Sydney, NSW 2010, Australia
[3] Univ NSW, St Vincents Clin Sch, Sydney, NSW 2010, Australia
[4] CSIRO, Div Computat Informat, Sydney, NSW 2113, Australia
Funding
UK Medical Research Council
Keywords
RNA SEQUENCING DATA
DOI
10.1093/bioinformatics/btu036
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
The initial steps in the analysis of next-generation sequencing data can be automated by way of software 'pipelines'. However, individual components depreciate rapidly because of the evolving technology and analysis methods, often rendering entire versions of production informatics pipelines obsolete. Constructing pipelines from Linux bash commands enables the use of hot-swappable modular components, as opposed to the more rigid program call wrapping by higher-level languages that comparable published pipelining systems implement. Here we present Next Generation Sequencing ANalysis for Enterprises (NGSANE), a Linux-based, high-performance-computing-enabled framework that minimizes the overhead of setting up and processing new projects, yet maintains the full flexibility of custom scripting when processing raw sequence data.
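The idea of building a pipeline from plain shell commands so that a single component can be swapped out may be easier to see in a small sketch. The example below is only an illustration under stated assumptions, not NGSANE's implementation or interface: the `MODULES` table, the stage names, and the `run_pipeline` and `demo` helpers are all hypothetical, and generic Unix text utilities stand in for real sequence-analysis tools.

```python
import subprocess

# Hypothetical module table: each stage name maps to a plain shell command
# that reads stdin and writes stdout. None of these names come from NGSANE.
MODULES = {
    "normalize": "tr 'a-z' 'A-Z'",
    "order": "sort",
    "dedupe": "uniq",
}


def run_pipeline(stages, modules, data):
    """Feed `data` through each stage's shell command, in order."""
    for stage in stages:
        result = subprocess.run(
            modules[stage],
            shell=True,           # the stage is an ordinary shell command line
            input=data,
            capture_output=True,
            text=True,
            check=True,           # raise if any stage exits with an error
        )
        data = result.stdout
    return data


def demo():
    reads = "acgt\nttga\nacgt\n"  # toy input, not real sequencing data
    stages = ["normalize", "order", "dedupe"]
    default = run_pipeline(stages, MODULES, reads)
    # Swap one component: only the 'dedupe' command changes; the stage list
    # and the runner stay untouched.
    swapped = run_pipeline(stages, {**MODULES, "dedupe": "uniq -c"}, reads)
    return default, swapped


if __name__ == "__main__":
    for output in demo():
        print(output, end="")
```

Because each stage is just a command string, replacing a tool means editing one table entry rather than rewriting a wrapper around the program call, which is the contrast the abstract draws with higher-level-language wrapping.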
Pages: 1471-1472
Page count: 2