GOexpress accepts gene expression datasets from both microarray and RNA-seq platforms, formatted in the recommended Bioconductor “ExpressionSet” container, and evaluates the power of each feature expressed in the dataset to cluster biological samples according to known experimental factors. In a second step, genes associated with a common ontology term (Ensembl BioMart annotations by default) are summarized to identify the GO terms that best cluster the same biological samples.
“The integration of expression values with gene ontology analysis makes GOexpress stand apart from most Gene Ontology analysis tools seeking enrichment of GO terms within lists of gene names”, says Kevin Rue-Albrecht, one of the tool’s developers.
GOexpress enables the analysis of both continuous (e.g. time-series, drug concentration) and categorical (e.g. treatment, condition) experimental factors, to identify the gene expression profiles – and GO terms – that most consistently cluster samples across all data-points. Presently, GOexpress does not provide inferential statistics such as p-values. Instead, the randomForest algorithm inherently makes the expressed gene features in the dataset compete with one another, ranking them in decreasing order of clustering power and thereby prioritizing the visualization of top-ranked genes and GO terms. A one-way ANOVA is available as an alternative statistical framework, and other statistical tests may be added upon suggestion.
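The two-step idea described above (score each gene, then summarize genes per GO term) can be illustrated with a minimal sketch. It is not GOexpress code: it is written in plain Python rather than R, it uses the one-way ANOVA F statistic instead of random forest importance, and it assumes that a term is summarized by the mean rank of its annotated genes. The gene names, term labels (TERM_X, TERM_Y) and expression values are hypothetical toy data.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic for one gene across sample groups."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of samples around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_genes(expression, labels):
    """Score every gene, then rank genes by decreasing score (rank 1 = best)."""
    scores = {gene: anova_f(vals, labels) for gene, vals in expression.items()}
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {gene: position + 1 for position, gene in enumerate(ordered)}
    return scores, ranks


def summarize_terms(ranks, annotations):
    """Average the ranks of the genes annotated to each term (assumed summary)."""
    summary = {}
    for term, genes in annotations.items():
        present = [ranks[g] for g in genes if g in ranks]
        if present:
            summary[term] = mean(present)
    # Lower mean rank means the term's genes separate the groups better.
    return sorted(summary.items(), key=lambda item: item[1])


# Hypothetical toy data: 6 samples, 4 genes, 2 ontology terms.
labels = ["control", "control", "control", "treated", "treated", "treated"]
expression = {
    "geneA": [1.0, 1.2, 0.9, 5.0, 5.3, 4.8],  # strong group separation
    "geneB": [2.0, 2.1, 1.9, 3.0, 3.2, 2.9],  # moderate group separation
    "geneC": [4.0, 1.0, 3.0, 2.0, 4.5, 2.5],  # noisy, weak separation
    "geneD": [2.5, 2.4, 2.6, 2.5, 2.6, 2.4],  # no separation
}
annotations = {"TERM_X": ["geneA", "geneB"], "TERM_Y": ["geneC", "geneD"]}

scores, ranks = rank_genes(expression, labels)
term_ranking = summarize_terms(ranks, annotations)
print(ranks)
print(term_ranking)
```

In this toy example the genes that separate control from treated samples rank first, and the term annotated to them comes out on top. Because the output is a ranking rather than a p-value, it supports prioritization for visualization, not inference.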
Finally, GOexpress offers various data-driven plotting functions that integrate easily with widely used bioinformatics tools such as edgeR and DESeq to visualize the expression profiles of differentially expressed genes.
Availability – GOexpress is available as a Bioconductor package. The currently recommended version of GOexpress is the development version 1.1.5, which includes new stable features and a clearer manual than the release version. http://www.bioconductor.org/packages/devel/bioc/html/GOexpress.html