The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. Researchers from the Keck School of Medicine of USC offer two cloud-scale RNA-seq pipelines: Arkas-Quantification, which deploys Kallisto for parallel cloud computations, and Arkas-Analysis, which annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata and performs differential expression and gene-set enrichment analysis on both coding genes and transcripts. The biologically informative downstream gene-set analysis focuses on Reactome annotations while supporting ENSEMBL transcriptomes. The Arkas cloud quantification pipeline supports custom user-uploaded FASTA files, optional bias correction, and optional pseudoBAM output. Retaining pseudoBAM output for structural variant detection and annotation provides a middle ground between de novo transcriptome assembly and routine quantification, while consuming a fraction of the resources used by popular fusion detection pipelines. Both applications are hosted in Illumina’s BaseSpace cloud computing environment, which offers a massively parallel, distributed quantification step and lets investigators benefit from the efficiencies of scale inherent in cloud-based computing platforms.
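To make the quantification options concrete, the sketch below assembles a `kallisto quant` command line with bias correction and pseudoBAM output switched on. It is only an illustration of the underlying Kallisto call, not the Arkas-Quantification implementation: the helper name `kallisto_quant_command` and all file names are hypothetical, and the exact behaviour of `--pseudobam` depends on the installed Kallisto version. The command is built and printed, not executed.

```python
import shlex


def kallisto_quant_command(index_path, output_dir, fastq_files,
                           bias=False, pseudobam=False):
    """Build (but do not run) a `kallisto quant` command for paired-end reads.

    Hypothetical helper for illustration; not part of Arkas.
    """
    cmd = ["kallisto", "quant", "-i", index_path, "-o", output_dir]
    if bias:
        # Sequence-based bias correction.
        cmd.append("--bias")
    if pseudobam:
        # Retain pseudoalignments in BAM form for downstream inspection.
        cmd.append("--pseudobam")
    cmd.extend(fastq_files)
    return cmd


# Hypothetical file names, assumed for the example only.
cmd = kallisto_quant_command(
    "transcripts.idx",
    "sample1_out",
    ["sample1_R1.fastq.gz", "sample1_R2.fastq.gz"],
    bias=True,
    pseudobam=True,
)
print(shlex.join(cmd))
```

Returning the command as a list keeps it safe to pass to `subprocess.run` without shell quoting, should one wish to execute it on a machine where Kallisto is installed.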
Arkas-Analysis Gene-Set Enrichment Plot.
Gene-set enrichment output report. Each point represents the differential mean activity of one gene-set, shown with its 95% confidence interval. The X-axis shows the individual gene-sets; the Y-axis shows the log2 fold change.
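As a rough illustration of what one point and its interval in such a plot summarise, the sketch below computes the mean log2 fold change of a gene-set together with a 95% confidence interval. It assumes a simple normal approximation (mean ± 1.96 standard errors); the source does not state how Arkas-Analysis derives its intervals, so this should not be read as its method. The function name and the input values are hypothetical.

```python
import math
import statistics


def mean_with_ci95(log2_fold_changes):
    """Return (mean, lower, upper) for a list of per-gene log2 fold changes.

    Uses a normal approximation for the 95% interval; this is an assumption
    made for illustration, not the procedure used by Arkas-Analysis.
    """
    n = len(log2_fold_changes)
    if n < 2:
        raise ValueError("need at least two values to estimate an interval")
    mean = statistics.fmean(log2_fold_changes)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(log2_fold_changes) / math.sqrt(n)
    half_width = 1.96 * sem
    return mean, mean - half_width, mean + half_width


# Hypothetical log2 fold changes for the genes of one gene-set.
example_set = [0.8, 1.1, 0.4, 1.5, 0.9]
mean, lower, upper = mean_with_ci95(example_set)
print(f"mean log2FC = {mean:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

A gene-set whose interval excludes zero would, under these assumptions, suggest a consistent shift in activity between conditions.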
Availability – The latest source code: https://github.com/RamsinghLab/Arkas-RNASeq