Optimizing a Massive Parallel Sequencing Workflow for Quantitative miRNA Expression Analysis

Optimized microRNA differential expression analysis workflow for digital data

Massive Parallel Sequencing (MPS) methods can extend and improve on the knowledge obtained with conventional microarray technology, both for mRNAs and for short non-coding RNAs such as miRNAs. The processing methods used to extract and interpret that information are an important part of handling the vast amounts of data generated by short-read sequencing. Although the number of computational tools for MPS data analysis is constantly growing, their strengths and weaknesses as components of a complex analytical pipeline have not yet been well investigated.

Researchers at the Department of Computer Sciences, University of Torino, Italy, set out to define a clear, simple, and optimized analytical workflow for the digital quantitative analysis of miRNAs.

They merged a publicly available MPS spike-in miRNA data set with MPS data derived from healthy donor peripheral blood mononuclear cells to assemble a benchmark MPS miRNA data set that resembles a situation in which miRNAs are spiked into biological replicate experiments.

They observed that short-read counts are strongly underestimated for duplicated miRNAs when the whole genome is used as the reference. Furthermore, the sensitivity of miRNA detection depends strongly on the primary tool used in the analysis. Among the six tested aligners specifically devoted to miRNA detection, SHRiMP and MicroRazerS show the highest sensitivity. Differential expression estimation is quite efficient: of the five tools investigated, two (DESeq and baySeq) show very good specificity and sensitivity in detecting differential expression.
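To make the terms sensitivity and specificity concrete, here is a minimal Python sketch of how they could be computed on a spike-in benchmark, where the set of truly changed miRNAs is known in advance. This is an illustration only, not the authors' code: the function name `sensitivity_specificity` and the miRNA identifiers are hypothetical, and the numbers are made up for the example.

```python
def sensitivity_specificity(called, truly_changed, all_mirnas):
    """Score a differential-expression call set against a known truth set.

    called        -- miRNAs a tool reported as differentially expressed
    truly_changed -- miRNAs known to differ (e.g. the spiked-in ones)
    all_mirnas    -- every miRNA that was tested
    """
    universe = set(all_mirnas)
    called = set(called) & universe
    truth = set(truly_changed) & universe

    tp = len(called & truth)             # changed and reported
    fn = len(truth - called)             # changed but missed
    fp = len(called - truth)             # reported but unchanged
    tn = len(universe - called - truth)  # unchanged and not reported

    # Guard against empty truth or empty negative sets.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical benchmark: 10 tested miRNAs, 4 of them spiked in.
all_mirnas = [f"miR-{i}" for i in range(1, 11)]
truly_changed = ["miR-1", "miR-2", "miR-3", "miR-4"]
called = ["miR-1", "miR-2", "miR-3", "miR-9"]  # 3 hits, 1 false call

sens, spec = sensitivity_specificity(called, truly_changed, all_mirnas)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy case the tool recovers 3 of the 4 spiked-in miRNAs and makes one false call among the 6 unchanged ones. A benchmark with known spike-ins allows exactly this kind of scoring, because the ground truth does not have to be inferred from the data.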

  • Cordero F, Beccuti M, Arigoni M, Donatelli S, Calogero RA. (2012) Optimizing a Massive Parallel Sequencing Workflow for Quantitative miRNA Expression Analysis. PLoS One 7(2), e31630.