NGSANE: A Lightweight Production Informatics Framework for High Throughput Data Analysis

The initial steps in the analysis of Next Generation Sequencing (NGS) data can be automated by way of software ‘pipelines’. However, individual components depreciate rapidly due to evolving technology and analysis methods, often rendering entire versions of production informatics pipelines obsolete. Constructing pipelines from Linux bash commands enables the use of hot swappable, modular components as opposed to the more rigid program-call wrapping by higher level languages, as implemented in comparable published pipelining systems.

Here, researchers from the Garvan Institute of Medical Research, Australia, present Next Generation Sequencing Analysis for Enterprises (NGSANE), a Linux-based, High Performance Computing (HPC) enabled framework that minimises the overhead of setting up and processing new projects, yet retains the full flexibility of custom scripting when processing raw sequence data.
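The idea of building a pipeline from hot swappable bash components can be illustrated with a minimal sketch. This is not NGSANE's actual code or configuration format: the function names (align_with_tool_a, align_with_tool_b, summarise, run_pipeline) and the ALIGN_STEP variable are hypothetical placeholders, and each step only echoes what a real tool call would do.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a modular bash pipeline; not NGSANE's real code.
set -euo pipefail

# Two interchangeable implementations of the same step. In a real pipeline
# each would wrap a command-line tool; here they only echo a description.
align_with_tool_a() { echo "tool_a aligned $1"; }
align_with_tool_b() { echo "tool_b aligned $1"; }

# A downstream step that consumes the output of whichever aligner ran.
summarise() { echo "summary of: $1"; }

# Project-level setting (assumed name): swapping a component means
# changing this one value, not rewriting the pipeline.
ALIGN_STEP="${ALIGN_STEP:-align_with_tool_a}"

run_pipeline() {
    local sample="$1"
    local aligned
    # Call the configured step by name.
    aligned="$("$ALIGN_STEP" "$sample")"
    summarise "$aligned"
}

run_pipeline "sample1.fastq"
```

Setting ALIGN_STEP=align_with_tool_b before calling run_pipeline swaps the component without touching the rest of the script, which is the kind of flexibility the bash-based approach is meant to provide.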


Availability and Implementation: NGSANE is implemented in bash and publicly available under BSD (3-Clause) licence via GitHub at https://github.com/BauerLab/ngsane

Contact: Denis.Bauer@csiro.au

Buske FA, French HJ, Smith MA, Clark SJ, Bauer DC. (2014) NGSANE: A Lightweight Production Informatics Framework for High Throughput Data Analysis. Bioinformatics [Epub ahead of print].