RNA-Seq technology measures the transcript abundance by generating sequence reads and counting their frequencies across different biological conditions. To identify differentially expressed genes between two conditions, it is important to consider the experimental design as well as the distributional property of the data. In many RNA-Seq studies, the expression data are obtained as multiple pairs, e.g., pre- vs. post-treatment samples from the same individual. We seek to incorporate paired structure into analysis.
Now, a team led by researchers at Yale University have developed a Bayesian hierarchical mixture model for RNA-Seq data to separately account for the variability within and between individuals from a paired data structure. The method assumes a Poisson distribution for the data mixed with a gamma distribution to account variability between pairs. The effect of differential expression is modeled by two-component mixture model. The performance of this approach is examined by simulated and real data.
In this setting, the proposed model provides higher sensitivity than existing methods to detect differential expression. Application to real RNA-Seq data demonstrates the usefulness of this method for detecting expression alteration for genes with low average expression levels or shorter transcript length.
Availability: The method was implemented in R and is available at http://bioinformatics.med.yale.edu
- Chung LM, Ferguson JP, Zheng W, Qian F,Bruno V, Montgomery RR, Zhao H(2013) Differential expression analysis for paired RNA-seq data. BMC Bioinformatics 14, 110. [abstract]