Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods

Gene set analysis (GSA) is used to elucidate genome-wide data, in particular transcriptome data. A multitude of methods have been proposed for this step of the analysis, and many of them have been compared and evaluated. Unfortunately, there is no consolidated opinion regarding which methods should be preferred, and the variety of available GSA software and implementations poses a difficulty for the end-user who wants to try out different methods.

To address this, researchers at Chalmers University of Technology, Sweden, have developed the R package Piano, which collects a range of GSA methods into a single system for the benefit of the end-user. They further refine the GSA workflow by using modifications of the gene-level statistics. This enables them to divide the resulting gene set P-values into three classes, describing different aspects of gene expression directionality at the gene set level.
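To make the idea of directionality concrete, the following is a minimal sketch of how modifying gene-level statistics can change what a gene set score captures. It is an illustration only, not Piano's implementation: the function name `directional_scores`, the score labels, and the use of a plain mean are all assumptions, and it computes simple scores rather than the P-values the package reports.

```python
from statistics import mean


def directional_scores(gene_stats, gene_set):
    """Summarise one gene set from signed gene-level statistics.

    gene_stats maps gene name -> signed statistic (positive = up-regulated).
    All names and the choice of the mean are hypothetical, for illustration.
    """
    values = [gene_stats[g] for g in gene_set if g in gene_stats]
    if not values:
        raise ValueError("none of the genes in the set have a statistic")

    up = [v for v in values if v > 0]
    down = [-v for v in values if v < 0]
    return {
        # Signed statistics: up- and down-regulated genes cancel out.
        "signed": mean(values),
        # Absolute statistics: direction is ignored entirely.
        "absolute": mean(abs(v) for v in values),
        # Up- and down-regulated subsets scored separately.
        "up_only": mean(up) if up else 0.0,
        "down_only": mean(down) if down else 0.0,
    }


# A set whose genes change strongly but in opposite directions.
stats = {"A": 2.0, "B": -2.0, "C": 1.0, "D": -1.0}
scores = directional_scores(stats, ["A", "B", "C", "D"])
```

In this toy example the signed score is zero while the absolute and subset scores are not, which is the kind of difference a directionality-aware analysis is meant to expose.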

The researchers demonstrate their fully implemented workflow by investigating the impact of the individual components of GSA using microarray and RNA-seq data. The results show that the evaluated methods are globally similar and that the major separation correlates well with their defined directionality classes. As a consequence, they suggest using a consensus scoring approach based on multiple GSA runs. In combination with the directionality classes, this constitutes a more thorough basis for an enriched biological interpretation.
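One simple way to picture consensus scoring across multiple GSA runs is to rank the gene sets within each run and average the ranks. The sketch below does exactly that. It is an assumption made for illustration rather than the authors' procedure: the function names, the gene set names, and the P-values are hypothetical, and it assumes every run reports the same gene sets.

```python
def rank_by_pvalue(pvalues):
    """Rank gene sets by ascending P-value; tied sets share the average rank."""
    ordered = sorted(pvalues, key=pvalues.get)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the block of gene sets tied with ordered[i].
        while j + 1 < len(ordered) and pvalues[ordered[j + 1]] == pvalues[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[ordered[k]] = average_rank
        i = j + 1
    return ranks


def consensus_scores(runs):
    """Mean rank per gene set across runs; lower means more consistently top-ranked."""
    per_run = [rank_by_pvalue(run) for run in runs]
    return {
        name: sum(ranks[name] for ranks in per_run) / len(per_run)
        for name in per_run[0]
    }


# Hypothetical P-values from three GSA methods applied to the same gene sets.
runs = [
    {"glycolysis": 0.001, "tca_cycle": 0.04, "ribosome": 0.5},
    {"glycolysis": 0.02, "tca_cycle": 0.01, "ribosome": 0.3},
    {"glycolysis": 0.005, "tca_cycle": 0.03, "ribosome": 0.9},
]
consensus = consensus_scores(runs)
ordering = sorted(consensus, key=consensus.get)
```

Averaging ranks rather than raw P-values keeps any single method with unusually small or large P-values from dominating the combined result.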

Availability – Piano is available, together with a user manual, for download at www.sysbio.se/piano.

  • Väremo L, Nielsen J, Nookaew I. (2013) Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods. Nucleic Acids Res [Epub ahead of print]. [article]