RNA-Seq is an increasingly used methodology for studying both coding and non-coding RNA expression. Many software tools are available for each phase of an RNA-Seq analysis, and each of them uses different algorithms. Furthermore, the analysis consists of several steps: alignment (primary-analysis), quantification and differential analysis (secondary-analysis), and any tertiary-analysis. Handling each step separately can therefore be time-consuming, and it also requires computing knowledge. For this reason, an automated pipeline that manages the entire analysis through a single initial command, and that is easy to use even for those without computer skills, can be useful. Given the vast number of RNA-Seq analysis tools available, it is first necessary to select a limited number of pipelines to include. For this purpose, University of Perugia researchers compared eight pipelines obtained by combining the most used tools and, for each one, evaluated peak RAM usage, running time, sensitivity and specificity.
The pipeline with the shortest running times, the lowest RAM consumption and the highest sensitivity is the one consisting of HISAT2 for alignment, featureCounts for quantification and edgeR for differential analysis. Here, the researchers have developed ARPIR, an automated pipeline that uses this combination by default but also lets users choose among the tools of the best-performing pipelines.
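To make the two accuracy metrics concrete, the following minimal Python sketch computes sensitivity and specificity for a set of genes called differentially expressed. It assumes a reference set of truly differentially expressed genes is available; how the researchers defined that reference is not stated here, and the function name and gene identifiers are purely illustrative.

```python
def sensitivity_specificity(called, truth, all_genes):
    """Return (sensitivity, specificity) for a set of called genes.

    called    -- genes a pipeline reports as differentially expressed
    truth     -- genes assumed to be truly differentially expressed
    all_genes -- every gene that was tested
    """
    called, truth, all_genes = set(called), set(truth), set(all_genes)
    tp = len(called & truth)              # true positives
    fn = len(truth - called)              # false negatives
    fp = len(called - truth)              # false positives
    tn = len(all_genes - called - truth)  # true negatives
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical example: 10 tested genes, 4 truly changed, 4 called.
genes = [f"g{i}" for i in range(1, 11)]
sens, spec = sensitivity_specificity(
    called={"g1", "g2", "g3", "g5"},
    truth={"g1", "g2", "g3", "g4"},
    all_genes=genes,
)
```

Sensitivity is the fraction of truly changed genes that the pipeline recovers, while specificity is the fraction of unchanged genes that it correctly leaves uncalled.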
Workflow of the ARPIR pipeline
Starting from the input files and parameters, the ARPIR pipeline performs a complete RNA-Seq analysis. The primary-analysis comes first: a quality control on the FastQ files is followed by pre-processing and alignment, which can be done with TopHat2, HISAT2 or STAR, and then by a second quality control on the BAM files. The secondary-analysis covers quantification and differential analysis, which can follow the featureCounts-edgeR, featureCounts-DESeq2 or Cufflinks-cummeRbund pipelines. An optional tertiary-analysis follows, composed of a GO analysis and a Pathway analysis. The results can then be viewed in a Shiny App and, if desired, downloaded as a report.
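The order of the stages and the tool choices described above can be summarised in a short Python sketch. This is not ARPIR's actual interface: the function and variable names are hypothetical, and the sketch does not restrict which aligner may be paired with which secondary-analysis pipeline, since the valid combinations are not stated here. It only validates the choices named in the text and lists the resulting steps.

```python
# Tool choices named in the workflow description.
ALIGNERS = ("TopHat2", "HISAT2", "STAR")
SECONDARY = {
    "featureCounts-edgeR": ("featureCounts", "edgeR"),
    "featureCounts-DESeq2": ("featureCounts", "DESeq2"),
    "Cufflinks-cummeRbund": ("Cufflinks", "cummeRbund"),
}


def plan_analysis(aligner="HISAT2", secondary="featureCounts-edgeR", tertiary=True):
    """Return the ordered list of steps for one run (illustrative only)."""
    if aligner not in ALIGNERS:
        raise ValueError(f"unknown aligner: {aligner!r}")
    if secondary not in SECONDARY:
        raise ValueError(f"unknown secondary-analysis pipeline: {secondary!r}")
    quantifier, de_tool = SECONDARY[secondary]
    steps = [
        # Primary-analysis
        "quality control on FastQ files",
        "pre-processing",
        f"alignment with {aligner}",
        "quality control on BAM files",
        # Secondary-analysis
        f"quantification with {quantifier}",
        f"differential analysis with {de_tool}",
    ]
    if tertiary:
        # Optional tertiary-analysis
        steps += ["GO analysis", "Pathway analysis"]
    return steps


if __name__ == "__main__":
    for number, step in enumerate(plan_analysis(), start=1):
        print(number, step)
```

The defaults mirror the HISAT2, featureCounts and edgeR combination that the comparison identified as the best performer.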
ARPIR supports the analysis of RNA-Seq data from groups undergoing different treatments, allows multiple comparisons in a single launch, and can be used with either paired-end or single-end data. All prerequisites can be installed via a configuration script, and the analysis can be launched through a graphical interface or a template script. In addition, ARPIR performs a final tertiary-analysis that includes a Gene Ontology and Pathway analysis. The results can be viewed in an interactive Shiny App and exported as a report (PDF, Word or HTML format). ARPIR is an efficient and easy-to-use tool for RNA-Seq analysis, from quality control to Pathway analysis, that lets users choose between different pipelines.
Availability – Project home page: https://github.com/giuliospinozzi/arpir.