Mammalian gene expression is inherently stochastic: RNA molecules are synthesized from each allele in discrete bursts. Although transcription is known to be regulated by promoters and enhancers, it is unclear how cis-regulatory sequences encode transcriptional burst kinetics. Characterization of transcriptional bursting, including burst size and burst frequency, has mainly relied on live-cell imaging or single-molecule RNA fluorescence in situ hybridization recordings of selected loci.
Karolinska Institutet researchers determine transcriptome-wide burst frequencies and sizes for endogenous mouse and human genes using allele-sensitive single-cell RNA sequencing. They show that core promoter elements affect burst size and uncover synergistic effects between TATA and initiator elements that were masked at the level of mean expression. Notably, they provide transcriptome-wide evidence that enhancers control burst frequencies, and demonstrate that cell-type-specific gene expression is primarily shaped by changes in burst frequencies. Together, these data show that burst frequency is primarily encoded in enhancers and burst size in core promoters, and that allelic single-cell RNA sequencing is a powerful approach for investigating transcriptional kinetics.
Transcriptome-wide inference of transcriptional burst kinetics
a, Allele-resolution kinetics inferred from scRNA-seq data. The total expression for the Mbnl2 gene (top) was separated into allelic expression (maternal: middle; paternal: bottom). Inference was performed independently on total expression and on allele-level expression to illustrate that allele-level inference has the required resolution; expression is measured as observed RNA molecules. b, Inferred burst kinetics for each gene (CAST allele) in primary fibroblasts (red dots, 7,186 genes). Blue contours indicate the inference precision, defined as the width of the confidence interval divided by the point estimate, obtained from simulated observations. Burst size is given in units of observed RNA molecules. c, Histogram of inferred burst frequencies for the CAST allele in primary fibroblasts, on the timescale of the mRNA degradation rate. d, Histogram of inferred burst sizes (observed RNA molecules) for the CAST allele in primary fibroblasts. e, Scatter plot comparing burst frequencies inferred with gene-specific mRNA degradation rates (x axis) against burst frequencies inferred without them (using the average degradation rate for all genes). Genes with the 50 longest (green) and shortest (red) mRNA degradation rates are marked. Data are from ES cells and the CAST allele. f, Histogram of allele-level waiting times between bursts (data from ES cells and the CAST allele). g, Scatter plot showing the inferred gene inactivation (koff) and activation (kon) rates, highlighting that genes have higher koff than kon values. Data are from fibroblasts and the CAST allele.
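The quantities named in the legend can be illustrated with a minimal sketch. It assumes the standard two-state (telegraph) model of transcription, which the kon and koff rates in panel g suggest but which this section does not spell out. Under that assumption, with all rates expressed in units of the mRNA degradation rate, steady-state counts per allele follow a Poisson-beta distribution, burst frequency is kon, and burst size is ksyn/koff. The synthesis rate `ksyn`, the function names, and the parameter values are hypothetical and chosen for illustration only. The moment-based estimator is a crude stand-in that holds only when koff is much larger than both kon and 1; it is not the inference procedure used in the study.

```python
import math
import random


def poisson_draw(lam, rng):
    """Knuth's multiplicative Poisson sampler (adequate for modest lam)."""
    if lam <= 0.0:
        return 0
    threshold = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_allele_counts(kon, koff, ksyn, n_cells, rng):
    """Steady-state RNA counts for one allele under the assumed two-state model.

    Rates are in units of the mRNA degradation rate. The fraction of
    time the allele is active is Beta(kon, koff) distributed, and the
    count given that fraction is Poisson (the Poisson-beta distribution).
    """
    counts = []
    for _ in range(n_cells):
        active_fraction = rng.betavariate(kon, koff)
        counts.append(poisson_draw(ksyn * active_fraction, rng))
    return counts


def burst_parameters(kon, koff, ksyn):
    """Return (burst frequency, burst size) as defined under the assumed model."""
    return kon, ksyn / koff


def moment_estimates(counts):
    """Rough (frequency, size) estimates from the sample mean and Fano factor.

    Valid only in the bursty limit (koff >> kon and koff >> 1), where
    mean ~ frequency * size and Fano factor ~ 1 + size.
    """
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    size = variance / mean - 1.0
    return mean / size, size


def relative_precision(ci_low, ci_high, point_estimate):
    """Width of the confidence interval divided by the point estimate (panel b)."""
    return (ci_high - ci_low) / point_estimate


# Hypothetical parameters for one allele of one gene.
rng = random.Random(0)
counts = simulate_allele_counts(kon=0.5, koff=50.0, ksyn=500.0, n_cells=5000, rng=rng)
true_freq, true_size = burst_parameters(0.5, 50.0, 500.0)
est_freq, est_size = moment_estimates(counts)
print("true frequency and size:", true_freq, true_size)
print("moment estimates:", est_freq, est_size)
```

The sketch draws counts directly from the steady-state distribution instead of simulating individual bursts over time, which keeps it short. Because burst frequency is expressed relative to the mRNA degradation rate, converting it to absolute time requires a degradation rate per gene, which is the comparison panel e addresses.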