Sample size calculation for differential expression analysis of RNA-seq data under Poisson distribution

Sample size determination is an important issue in the experimental design of biomedical research. Because of the complexity of RNA-seq experiments, however, the field currently lacks a sample size method widely applicable to differential expression studies utilising RNA-seq technology.

In this report, researchers at Vanderbilt University propose several methods for sample size calculation for single-gene differential expression analysis of RNA-seq data under a Poisson distribution. These methods are then extended to multiple genes, addressing the multiple testing problem by controlling the false discovery rate. Moreover, most of the proposed methods yield closed-form sample size formulas once the desired minimum fold change and minimum average read count are specified, and thus are not computationally intensive. Simulation studies evaluating the performance of the proposed sample size formulas are presented; the results indicate that these methods work well and achieve the desired power. Finally, these sample size calculation methods are applied to three real RNA-seq data sets.
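To give a feel for what a closed-form calculation of this kind looks like, here is a minimal sketch in Python. It is not taken from the paper: it uses a textbook Wald-type approximation for comparing two Poisson rates on the log scale, and it assumes equal sequencing depth and equal group sizes. The function names `poisson_sample_size` and `per_gene_alpha` are hypothetical, and the per-gene significance level uses a commonly quoted FDR approximation that is likewise an assumption here, so the numbers it produces may differ from those of the published methods.

```python
import math
from statistics import NormalDist


def per_gene_alpha(n_genes, n_de_genes, power, fdr):
    """Approximate per-gene significance level for a target FDR.

    Assumption (not from the source): alpha* = r1 * f / (m0 * (1 - f)),
    where r1 is the expected number of true discoveries and m0 the
    number of genes that are not differentially expressed.
    """
    m0 = n_genes - n_de_genes
    r1 = power * n_de_genes
    return r1 * fdr / (m0 * (1.0 - fdr))


def poisson_sample_size(fold_change, mean_count, alpha=0.05, power=0.8):
    """Samples per group for a two-sided Wald test on the log rate ratio.

    fold_change: minimum fold change to detect (treatment / control).
    mean_count:  minimum average read count of the gene in the control group.
    Assumes Poisson counts, equal depth, and equal group sizes.
    """
    if fold_change <= 0 or fold_change == 1:
        raise ValueError("fold_change must be positive and different from 1")
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1.0 - alpha / 2.0)
    z_beta = normal.inv_cdf(power)
    # Var(log rate ratio) is roughly (1/mu0 + 1/(rho * mu0)) / n per group.
    n = (z_alpha + z_beta) ** 2 * (1.0 + 1.0 / fold_change) / (
        mean_count * math.log(fold_change) ** 2
    )
    return math.ceil(n)


if __name__ == "__main__":
    # Single gene: detect a 2-fold change at an average count of 5.
    print(poisson_sample_size(fold_change=2.0, mean_count=5.0))
    # Many genes: tighten alpha for a 5% FDR, then reuse the same formula.
    a = per_gene_alpha(n_genes=10000, n_de_genes=100, power=0.8, fdr=0.05)
    print(poisson_sample_size(fold_change=2.0, mean_count=5.0, alpha=a))
```

The sketch makes the trade-off visible: the required sample size grows as the minimum fold change approaches 1, as the minimum average read count drops, or as the per-gene significance level is tightened to control the false discovery rate.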

  • Li CI, Su PF, Guo Y, Shyr Y. (2013) Sample size calculation for differential expression analysis of RNA-seq data under Poisson distribution. Int J Comput Biol Drug Des 6(4), 358-375. [article]