RNA sequencing has become a ubiquitous technology used throughout life sciences as an effective method of measuring RNA abundance quantitatively in tissues and cells. The increase in use of RNA-seq technology has led to the continuous development of new tools for every step of analysis from alignment to downstream pathway analysis. However, effectively using these analysis tools in a scalable and reproducible way can be challenging, especially for non-experts.
Using the workflow management system Snakemake, researchers from the Dana-Farber Cancer Institute have developed a user friendly, fast, efficient, and comprehensive pipeline for RNA-seq analysis. VIPER (Visualization Pipeline for RNA-seq analysis) is an analysis workflow that combines some of the most popular tools to take RNA-seq analysis from raw sequencing data, through alignment and quality control, into downstream differential expression and pathway analysis. VIPER has been created in a modular fashion to allow for the rapid incorporation of new tools to expand the capabilities. This capacity has already been exploited to include very recently developed tools that explore immune infiltrate and T-cell CDR (Complementarity-Determining Regions) reconstruction abilities. The pipeline has been conveniently packaged such that minimal computational skills are required to download and install the dozens of software packages that VIPER uses.
Overview of the full workflow performed by VIPER (Visualization Pipeline for RNAseq analysis
The different segments of the pipeline are broken down by color. The core of the pipeline is the read alignment performed by STAR that outputs alignment (bam) files. Gene expression is quantitated with Cufflinks for unsupervised analysis (clustering and PCA). STAR also generates a count matrix used for supervised analysis (differential expression with DESeq2). When a publically available analysis tool is used for a particular step, the name of the tool is identified above the arrow leading to the resulting output (boxed). When there is no tool indicated next to an arrow, the analysis step was performed with custom R code. Conditional/optional analyses are denoted with a hashed arrow and outlining box and represent the most distinguishing functionality for VIPER
Availability -The software described in this article are publically available online. Project home page: https://bitbucket.org/cfce/viper/