Arkas – Rapid reproducible RNA-Seq analysis

The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. Researchers from the Keck School of Medicine of USC offer two cloud-scale RNA-seq pipelines: Arkas-Quantification, which deploys Kallisto for parallel cloud computations, and Arkas-Analysis, which annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata, then performs differential expression and gene-set enrichment analysis on both coding genes and transcripts. The downstream gene-set analysis focuses on Reactome annotations while supporting ENSEMBL transcriptomes. The Arkas cloud quantification pipeline supports custom user-uploaded FASTA files, optional bias correction, and optional pseudoBAM output. Retaining pseudoBAM output for structural variant detection and annotation provides a middle ground between de novo transcriptome assembly and routine quantification, while consuming a fraction of the resources used by popular fusion detection pipelines. Both applications are hosted in Illumina's BaseSpace cloud computing environment, which offers a massively parallel, distributed quantification step; the authors argue that investigators are better served by cloud-based computing platforms because of their inherent efficiencies of scale.
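To make the quantification options concrete, here is a minimal sketch of how a wrapper might assemble a `kallisto quant` call with the two toggles mentioned above. It is not the Arkas-Quantification code: the function name `kallisto_quant_command` and its parameters are hypothetical, and only the Kallisto command-line flags (`-i`, `-o`, `-t`, `--bias`, `--pseudobam`) come from Kallisto itself. The sketch assumes paired-end reads and only builds the argument list; it does not run Kallisto.

```python
from typing import List


def kallisto_quant_command(
    index: str,
    out_dir: str,
    fastqs: List[str],
    bias: bool = False,
    pseudobam: bool = False,
    threads: int = 1,
) -> List[str]:
    """Build (but do not run) a paired-end `kallisto quant` argument list.

    Hypothetical helper for illustration; not part of Arkas.
    """
    # Paired-end mode expects FASTQ files in pairs: R1, R2, R1, R2, ...
    if len(fastqs) == 0 or len(fastqs) % 2 != 0:
        raise ValueError("paired-end mode needs an even, non-zero number of FASTQ files")

    cmd = ["kallisto", "quant", "-i", index, "-o", out_dir, "-t", str(threads)]
    if bias:
        cmd.append("--bias")        # sequence-based bias correction
    if pseudobam:
        cmd.append("--pseudobam")   # keep pseudoalignments for downstream inspection
    cmd.extend(fastqs)
    return cmd
```

The returned list could be passed to `subprocess.run(cmd, check=True)` on a machine where Kallisto is installed. How the pseudoalignments are written differs between Kallisto versions, so check the documentation for the version in use.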

Arkas-Analysis Gene-Set Enrichment Plot.


Gene-set enrichment output report. Each point represents the differential mean activity of a gene set, with its 95% confidence interval. The x-axis shows the individual gene sets; the y-axis shows the log2 fold change.
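As a rough illustration of what each point in such a plot summarizes, the sketch below computes a mean log2 fold change per gene set with a normal-approximation 95% confidence interval. This is a simplified stand-in, not the statistical method used by Arkas-Analysis: the function name `gene_set_summary`, the input layout, and the use of 1.96 standard errors are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev
from typing import Dict, List, Tuple


def gene_set_summary(
    log2fc: Dict[str, float],
    gene_sets: Dict[str, List[str]],
) -> Dict[str, Tuple[float, float, float]]:
    """Return {set_name: (mean_log2fc, ci_low, ci_high)} per gene set.

    Hypothetical helper: a normal-approximation interval on the mean,
    not the enrichment statistic reported by Arkas-Analysis.
    """
    summary = {}
    for name, genes in gene_sets.items():
        # Use only the genes that were actually quantified.
        values = [log2fc[g] for g in genes if g in log2fc]
        if len(values) < 2:
            continue  # a standard error needs at least two observations
        m = mean(values)
        se = stdev(values) / math.sqrt(len(values))
        summary[name] = (m, m - 1.96 * se, m + 1.96 * se)
    return summary
```

Gene sets with fewer than two quantified genes are skipped rather than plotted without an interval.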

Availability –  The latest source code:

Colombo AR, Triche TJ Jr, Ramsingh G. (2017) Arkas: Rapid reproducible RNAseq analysis. F1000Res 6:586. [article]
