The Level of Residual Dispersion Variation and the Power of Differential Expression Tests for RNA-Seq Data

RNA-Sequencing (RNA-Seq) has been widely adopted for quantifying gene expression changes in comparative transcriptome analysis. For detecting differentially expressed genes, a variety of statistical methods based on the negative binomial (NB) distribution have been proposed. These methods differ in how they handle the NB nuisance parameters (i.e., the dispersion parameters associated with individual genes) to gain power, for example by using a dispersion model to exploit an apparent relationship between the dispersion parameter and the NB mean. Presumably, dispersion models with fewer parameters will yield greater power when the models are correct, but will produce misleading conclusions when they are not.
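To make the dispersion-mean relationship concrete, here is a minimal sketch (not taken from the study) that simulates NB gene counts whose dispersion follows a hypothetical log-linear trend in the mean. The trend coefficients `a` and `b` are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dispersion model: log(phi) = a + b * log(mu).
# The coefficients below are illustrative, not estimated from data.
a, b = -1.0, -0.3

def nb_counts(mu, n_samples):
    """Draw NB counts for genes with means `mu`, where each gene's
    dispersion phi is determined by its mean through the trend above.
    NB parameterization used here: Var = mu + phi * mu**2."""
    phi = np.exp(a + b * np.log(mu))
    # numpy's negative_binomial takes (n, p); convert from (mu, phi):
    n = 1.0 / phi
    p = n / (n + mu)
    return rng.negative_binomial(n, p, size=(n_samples, mu.size))

mu = rng.uniform(10, 1000, size=500)   # hypothetical gene means
counts = nb_counts(mu, n_samples=6)    # 6 samples x 500 genes
```

Under this kind of model, a single pair of trend coefficients replaces hundreds of gene-specific dispersion parameters, which is exactly the parameter-saving trade-off discussed above.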

A new study by researchers at Oregon State University investigates this power and robustness trade-off by assessing rates of identifying true differential expression with the various methods under realistic assumptions about NB dispersion parameters. Their results indicate that the relative performance of the different methods is closely related to the level of dispersion variation left unexplained by the dispersion model. The researchers propose a simple statistic to quantify the level of residual dispersion variation from a fitted dispersion model and show that the magnitude of this statistic indicates whether, and how much, statistical power can be gained by a dispersion-modeling approach.
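One simple way to summarize residual dispersion variation, sketched below, is the standard deviation of the log-ratio between per-gene dispersion estimates and the model-fitted values. This is an illustrative stand-in, not necessarily the exact statistic proposed in the paper; `phi_hat` and `phi_fit` are assumed inputs from a prior estimation step.

```python
import numpy as np

def residual_dispersion_variation(phi_hat, phi_fit):
    """Hypothetical summary of residual dispersion variation:
    the sample standard deviation of log(phi_hat / phi_fit).
    A value of zero means the dispersion model explains all
    gene-to-gene dispersion variation; larger values mean more
    variation is left unexplained."""
    log_resid = np.log(phi_hat) - np.log(phi_fit)
    return float(np.std(log_resid, ddof=1))

# Example: per-gene estimates that deviate from the fitted trend.
phi_fit = np.array([0.10, 0.20, 0.30, 0.15])
phi_hat = phi_fit * np.array([1.0, 2.0, 0.5, 1.5])
stat = residual_dispersion_variation(phi_hat, phi_fit)
```

When a statistic like this is small, a low-parameter dispersion model is plausible and can buy power; when it is large, methods that shrink toward the model risk misleading conclusions, which mirrors the trade-off the study quantifies.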

Figure: the left panel is for the control group and the right panel is for the E2-treated group.

Mi G, Di Y (2015) The Level of Residual Dispersion Variation and the Power of Differential Expression Tests for RNA-Seq Data. PLoS ONE 10(4): e0120117. [article]
