microRNAs (miRNAs) are small non-coding RNAs (~22 nt) that are considered central post-transcriptional regulators of gene expression and key components in many pathological conditions. Next-Generation Sequencing (NGS) technologies have led to inexpensive, massive data production, revolutionizing every aspect of research in biology and medicine. In particular, small RNA-Seq (sRNA-Seq) enables small non-coding RNA quantification on a high-throughput scale, providing a closer look into the expression profiles of these crucial regulators within the cell.
Researchers from the University of Thessaly have developed DIANA-microRNA-Analysis-Pipeline (DIANA-mAP), a fully automated computational pipeline that allows the user to perform miRNA NGS data analysis, from raw sRNA-Seq libraries to quantification and Differential Expression Analysis, in an easy, scalable, efficient, and intuitive way. Emphasis has been given to data pre-processing, an early step that is critical for the robustness of the final results and conclusions. Through modularity, parallelizability and customization, DIANA-mAP produces high-quality expression results, reports and graphs for downstream data mining and statistical analysis. In an extended evaluation, DIANA-mAP outperformed similar tools that offer pre-processing without prior knowledge of the adapter sequence.
DIANA-mAP preprocessing workflow
The preprocessing workflow is composed of three individual steps. In the Data Acquisition step, the user can download publicly available datasets from online repositories by providing their accession numbers. The Adapter Detection step either uses an adapter sequence provided by the user or scans the dataset to infer it. The Quality Trimming/Adapter Removal step removes low-quality sections and full or partial adapter sequences from the dataset, cleaning it for further analysis.
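To make the Quality Trimming/Adapter Removal step more concrete, the following is a minimal Python sketch of the general idea, not DIANA-mAP's actual implementation. It assumes the adapter sequence is already known, uses exact matching only (real trimmers typically tolerate mismatches), and does not cover adapter inference. The function names (`quality_trim`, `trim_adapter`, `clean_read`), the thresholds, and the example read are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def quality_trim(read: str, quals: List[int], min_q: int = 20) -> Tuple[str, List[int]]:
    """Drop bases from the 3' end while their Phred score is below min_q."""
    end = len(read)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return read[:end], quals[:end]


def trim_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a full adapter, or a partial adapter prefix at the 3' end.

    Exact matching only; this is a simplification for illustration.
    """
    # Case 1: the full adapter occurs in the read; keep everything before it.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends with the first k bases of the adapter.
    # Try the longest possible overlap first; never go below 1 base.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # Case 3: no adapter evidence; return the read unchanged.
    return read


def clean_read(read: str, quals: List[int], adapter: str, min_q: int = 20) -> str:
    """Quality-trim the 3' end first, then remove the adapter."""
    trimmed, _ = quality_trim(read, quals, min_q)
    return trim_adapter(trimmed, adapter)


if __name__ == "__main__":
    # Illustrative inputs: a 22-nt insert followed by an example adapter string.
    insert = "TGAGGTAGTAGGTTGTATAGTT"
    adapter = "TGGAATTCTCGGGTGCCAAGG"
    raw = insert + adapter
    print(clean_read(raw, [35] * len(raw), adapter))
```

Trimming low-quality bases before searching for the adapter is one common ordering; it keeps unreliable 3' base calls from being treated as insert sequence when only part of the adapter was sequenced.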
Availability – DIANA-mAP is free to use under the MIT License and can be obtained from GitHub (https://github.com/athalexiou/DIANA-mAP). It is available either dockerized, with no dependency installation required, or standalone, accompanied by an installation manual on GitHub.