Single-cell RNA-sequencing (scRNA-seq) facilitates identification of new cell types and gene regulatory networks as well as dissection of the kinetics of gene expression and patterns of allele-specific expression. However, to enable such analyses, it is vital to separate biological variability from the high level of technical noise that affects scRNA-seq protocols.
Here, researchers from the EMBL-EBI describe and validate a generative statistical model that accurately quantifies technical noise with the help of external RNA spike-ins. Applying this approach to investigate stochastic allele-specific expression in individual cells, they demonstrate that a large fraction of stochastic allele-specific expression can be explained by technical noise, especially for lowly and moderately expressed genes. In fact, they predict that only 17.8% of stochastic allele-specific expression patterns are attributable to biological noise, with the remainder due to technical noise.
With the help of external RNA spike-in molecules, added in the same quantity to each cell’s lysate, we first estimate four parameters capturing technical variability: the expectation and variance of the capture efficiency (θ) and of the sequencing efficiency (γ). Then, by the general variance decomposition formula, the total observed variance of read counts can be decomposed into technical (blue) and biological (green) variance terms. The estimate of biological variance is obtained by subtracting the technical variance terms from the total observed variance. Shot noise (or Poisson noise) is cell-to-cell variability that can be modelled by a Poisson process.
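The subtraction step can be illustrated with a deliberately simplified sketch. It assumes that the only technical noise is Poisson shot noise, whose variance equals the mean count, and it ignores the capture and sequencing efficiency terms that the full model estimates from spike-ins. The function names, the example counts and the choice to clamp negative estimates at zero are illustrative assumptions, not part of the published method.

```python
from statistics import mean, pvariance


def poisson_technical_variance(counts):
    """Technical variance under a pure shot-noise assumption.

    For a Poisson process the variance equals the mean, so the mean
    read count serves as the technical variance term in this sketch.
    """
    return mean(counts)


def decompose_variance(counts, technical_variance):
    """Split the total observed variance into technical and biological parts.

    Biological variance is estimated as total minus technical variance.
    Clamping at zero is an assumption made here to avoid negative estimates.
    """
    total = pvariance(counts)
    biological = max(total - technical_variance, 0.0)
    return total, biological


# Hypothetical read counts for one gene across eight cells.
counts = [0, 3, 5, 12, 8, 1, 20, 7]
technical = poisson_technical_variance(counts)
total, biological = decompose_variance(counts, technical)
print(f"total={total}, technical={technical}, biological={biological}")
```

In this toy case the counts vary far more than shot noise alone would predict, so most of the observed variance is assigned to the biological term. With real data the technical term would also include the spike-in-derived efficiency parameters, which would shift more of the variance to the technical side.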