PANDORA – Systematic integration of RNA-Seq statistical algorithms

RNA-Seq is gradually becoming the standard tool for transcriptomic expression studies in biological research. Although considerable progress has been made in developing statistical algorithms that detect differentially expressed genes from RNA-Seq data, the list of detected genes can differ substantially from one algorithm to another.

Researchers at the Alexander Fleming Biomedical Sciences Research Center present a new method, PANDORA, that combines the output of multiple algorithms into a single summarized result that better reflects true experimental outcomes. It does so by systematically combining several analysis algorithms and weighting their outcomes according to their performance on realistically simulated data sets generated from real data. Analyses of both simulated and real data from different organisms, together with correlation with PolII occupancy, indicate that PANDORA improves the detection of differential expression by optimizing the tradeoff between standard performance measures such as precision and sensitivity.
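
As a rough illustration of the weighting idea, the following Python sketch combines per-gene p-values from several algorithms into a weighted average, with weights proportional to a performance score that each algorithm obtained on simulated data. The function name, the algorithm labels, the scores and the p-values are all made up for illustration. This is not the metaseqR implementation, and the exact weighting scheme used by PANDORA is the one described in the paper.

```python
def combine_weighted_pvalues(pvalues, scores):
    """Combine per-gene p-values from several algorithms by weighted averaging.

    pvalues: dict mapping algorithm name -> list of p-values (one per gene,
             same gene order for every algorithm).
    scores:  dict mapping algorithm name -> non-negative performance score
             (hypothetical; e.g. measured on simulated data sets).
    Returns a list with one combined p-value per gene.
    """
    if set(pvalues) != set(scores):
        raise ValueError("pvalues and scores must cover the same algorithms")

    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")

    # Normalize the scores so that the weights sum to 1.
    weights = {name: score / total for name, score in scores.items()}

    lengths = {len(p) for p in pvalues.values()}
    if len(lengths) != 1:
        raise ValueError("every algorithm must report the same number of genes")
    n_genes = lengths.pop()

    # Weighted average of the p-values, gene by gene.
    return [
        sum(weights[name] * pvalues[name][gene] for name in pvalues)
        for gene in range(n_genes)
    ]


# Illustrative input: two hypothetical algorithms, two genes.
example_pvalues = {"algo_a": [0.01, 0.5], "algo_b": [0.03, 0.1]}
example_scores = {"algo_a": 3.0, "algo_b": 1.0}  # algo_a did better on simulations
combined = combine_weighted_pvalues(example_pvalues, example_scores)
```

With these numbers, algo_a receives a weight of 0.75 and algo_b a weight of 0.25, so the combined p-value of each gene is pulled toward the value reported by the better-performing algorithm.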

Tradeoff between accuracy (F1-scores) and correlation with PolII occupancy for all methods

Availability – PANDORA is implemented in metaseqR, a Bioconductor (http://www.bioconductor.org) package for the analysis of RNA-Seq gene expression data. The package provides an interface to several normalization methods and statistical tests, methods for combining statistical tests, and detailed, comprehensive reporting facilities.

Moulos P, Hatzis P. (2014) Systematic integration of RNA-Seq statistical algorithms for accurate detection of differential gene expression patterns. Nucleic Acids Res [Epub ahead of print]. [article]
