The cost of RNA-Seq has been decreasing over the last few years. Despite this, experiments with four or fewer biological replicates are still quite common. With so little replication, estimating the variance of each gene's expression becomes a challenging and interesting problem. However, the wealth of microarray and other gene expression data available in public repositories can be leveraged to improve variance estimation.
A team led by researchers at the University of Sydney, Australia has developed a novel approach called Tshrink+ for inferring differential gene expression through improved modelling of the gene-wise variances. Existing methods share information between genes of similar average expression by shrinking, or moderating, the gene-wise variances towards a fitted common variance. The team achieves improved estimation of this common variance by using gene-wise sample variances from external experiments, as well as gene length.
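To make the idea of moderation concrete, here is a minimal sketch of shrinking gene-wise sample variances towards a common variance. It is a generic illustration and not the Tshrink+ or sydSeq implementation: the function name `moderate_variances`, the toy expression values, the common variance of 0.5 and the prior degrees of freedom `df_prior` are all assumptions chosen for the example. The weighted average by degrees of freedom is one standard form of moderation; the published method may differ in detail.

```python
from statistics import variance


def moderate_variances(sample_vars, common_vars, df_sample, df_prior):
    """Shrink each gene-wise sample variance towards its common variance.

    Each moderated value is a weighted average of the sample variance
    (weight df_sample) and the fitted common variance (weight df_prior).
    A larger df_prior pulls the estimates harder towards the common value.
    """
    if df_sample <= 0 or df_prior < 0:
        raise ValueError("df_sample must be positive and df_prior non-negative")
    total = df_sample + df_prior
    return [
        (df_prior * common + df_sample * sample) / total
        for sample, common in zip(sample_vars, common_vars)
    ]


# Toy log-expression values: three replicates per gene (hypothetical data).
replicates = {
    "geneA": [5.0, 5.2, 4.8],  # unusually small sample variance
    "geneB": [7.0, 9.0, 8.0],  # unusually large sample variance
}

# Unbiased sample variance per gene; three replicates give 2 degrees of freedom.
sample_vars = [variance(values) for values in replicates.values()]

# Assumed common variance for genes at this expression level. In practice it
# would be fitted from the data, possibly with external information.
common_vars = [0.5, 0.5]

moderated = moderate_variances(sample_vars, common_vars, df_sample=2, df_prior=4)

for gene, raw, shrunk in zip(replicates, sample_vars, moderated):
    print(f"{gene}: sample variance {raw:.3f}, moderated variance {shrunk:.3f}")
```

With few replicates, the raw variance of geneA is implausibly small and would inflate its test statistic; moderation pulls it up towards the common value, and pulls the large variance of geneB down. A better estimate of the common variance therefore feeds directly into every gene's test.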
Using biological data, the team shows that utilising additional external information can improve the modelling of the common variance and hence the calling of differentially expressed genes. These sources of additional information include gene length and gene-wise sample variances from other RNA-Seq and microarray datasets, of both related and seemingly unrelated tissue types. The results are promising: their differential expression test, Tshrink+, performs favourably against existing methods such as DESeq and edgeR in both gene ranking and sensitivity. These improved variance models could easily be implemented in both DESeq and edgeR, and they highlight the need for a database that offers a profile of gene variances over a range of tissue types and organisms.
Availability: This method is implemented in the R package sydSeq, available at http://www.maths.usyd.edu.au/u/jeany/software.htm
- Patrick E, Buckley M, Lin DM, Yang YH. (2013) Improved moderation for gene-wise variance estimation in RNA-Seq via the exploitation of external information. BMC Genomics Suppl 1, S9.