The advent of high-throughput RNA-seq at the single-cell level has opened up new opportunities to elucidate the heterogeneity of gene expression. One of the most widespread applications of RNA-seq is identifying genes that are differentially expressed (DE) between two experimental conditions. Here, researchers from the University of Cambridge and the Sanger Institute present the discrete, distributional method for differential gene expression (D3E), a novel algorithm designed specifically for single-cell RNA-seq data. Using synthetic data, they show that D3E can detect changes in expression even when the mean level remains unchanged. Because D3E is based on an analytically tractable stochastic model, it also provides additional biological insight by quantifying biologically meaningful properties such as the average burst size and burst frequency. Applying D3E to experimental data, they use the underlying model to directly test hypotheses about the mechanisms driving changes in gene expression.
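To make the "same mean, different distribution" point concrete, here is a minimal Python sketch. The summary above does not say which statistical test D3E applies, so the two-sample Kolmogorov-Smirnov statistic is used purely as an illustrative stand-in for a distributional comparison. The read counts are made up, and the function name `ks_statistic` is hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # Empirical CDFs are step functions, so checking the observed values suffices.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical read counts for one gene in two conditions (made-up values).
unimodal = [5] * 10           # every cell expresses the same amount
bimodal = [0] * 5 + [10] * 5  # half the cells are silent, half are bursting

mean_unimodal = sum(unimodal) / len(unimodal)
mean_bimodal = sum(bimodal) / len(bimodal)
ks_gap = ks_statistic(unimodal, bimodal)
```

Both samples have the same mean, so a comparison of means alone would see no difference, while the gap between the empirical distributions is large.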
Overview of D3E. A) Graphical representation of the transcriptional bursting model. B) Example of a realization of the transcriptional bursting model with parameters α = 1, β = 10, γ = 100, and λ = 1. In this regime, the gene exhibits bursty behavior with a bimodal stationary distribution. C) Derivation of the biologically relevant parameters from the parameters of the transcriptional bursting model. D) Flowchart of the D3E algorithm.
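The bursting regime described in panel B can be explored with a short simulation. The sketch below is not taken from D3E. It assumes the usual two-state (telegraph) reading of the four parameters: `alpha` for promoter activation, `beta` for promoter inactivation, `gamma` for transcription while active, and `lam` for transcript degradation. It uses a plain Gillespie simulation and the summary formulas commonly quoted for this model. The function names are hypothetical, and D3E's exact definitions may differ.

```python
import random


def burst_summary(alpha, beta, gamma, lam):
    """Commonly used summaries of the two-state (telegraph) model."""
    return {
        "burst_size": gamma / beta,                        # transcripts per active period
        "burst_frequency": alpha * beta / (alpha + beta),  # on/off cycles per unit time
        "mean_expression": gamma * alpha / (lam * (alpha + beta)),
    }


def simulate(alpha, beta, gamma, lam, t_end, seed=0):
    """Gillespie simulation; returns a list of (time, promoter_on, transcripts)."""
    rng = random.Random(seed)
    t, on, n = 0.0, False, 0
    path = [(t, on, n)]
    while True:
        decay = lam * n                  # each transcript degrades independently
        produce = gamma if on else 0.0   # transcription only while the promoter is active
        switch = beta if on else alpha   # promoter toggles between off and on
        total = decay + produce + switch
        t += rng.expovariate(total)      # waiting time to the next event
        if t >= t_end:
            return path
        u = rng.random() * total         # pick an event in proportion to its rate
        if u < decay:
            n -= 1
        elif u < decay + produce:
            n += 1
        else:
            on = not on
        path.append((t, on, n))


summary = burst_summary(alpha=1, beta=10, gamma=100, lam=1)
path = simulate(alpha=1, beta=10, gamma=100, lam=1, t_end=50.0)
```

With these values the promoter is active only about one eleventh of the time but transcribes quickly while it is, which is what produces the bursts.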
Availability – A command-line version of D3E written in Python can be downloaded from GitHub (https://github.com/hemberg-lab/D3E), and the source code is available under the GPL licence. A web version is also available at http://www.sanger.ac.uk/sanger/GeneRegulation_D3E. Because of the time required to run D3E, the web version limits the number of genes and cells that can be analyzed, and it supports only the method of moments for estimating parameters.
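For readers unfamiliar with the method of moments mentioned above, the following Python sketch shows one way it can work. It is not D3E's code. It assumes that stationary transcript counts follow a Poisson-Beta distribution (the stationary law of the two-state bursting model) and that rates are expressed in units of the degradation rate. Three parameters, here called `alpha`, `beta`, and `gamma`, are recovered from ratios of the first three factorial moments. The function names are hypothetical.

```python
def factorial_moment_ratios(counts):
    """Ratios of successive sample factorial moments: f1, f2 / f1, f3 / f2."""
    n = len(counts)
    f1 = sum(counts) / n
    f2 = sum(x * (x - 1) for x in counts) / n
    f3 = sum(x * (x - 1) * (x - 2) for x in counts) / n
    return f1, f2 / f1, f3 / f2


def poisson_beta_from_ratios(r1, r2, r3):
    """Solve r_k = gamma * (alpha + k - 1) / (alpha + beta + k - 1) for k = 1, 2, 3."""
    d = r1 * r2 - 2 * r1 * r3 + r2 * r3
    c = r1 - 2 * r2 + r3
    alpha = 2 * r1 * (r3 - r2) / d
    beta = 2 * (r2 - r1) * (r1 - r3) * (r3 - r2) / (d * c)
    gamma = -d / c
    return alpha, beta, gamma


# Exact moment ratios for alpha = 1, beta = 10, gamma = 100 (no sampling noise).
exact = (100 * 1 / 11, 100 * 2 / 12, 100 * 3 / 13)
alpha, beta, gamma = poisson_beta_from_ratios(*exact)
```

In practice the ratios would come from `factorial_moment_ratios` applied to the observed counts of one gene. With noisy or sparse data the estimates can come out negative or undefined, so a real implementation would need to check for that.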