The ability to interrogate the total RNA content of single cells would enable better mapping of the transcriptional logic behind emerging cell types and states. However, current single-cell RNA-sequencing (RNA-seq) methods cannot monitor all forms of RNA transcripts simultaneously at the single-cell level, and thus deliver only a partial snapshot of the cellular RNAome.
Stanford University researchers have developed Smart-seq-total, a method capable of assaying a broad spectrum of coding and noncoding RNA from a single cell. Smart-seq-total does not require splitting the RNA content of a cell and allows the incorporation of unique molecular identifiers into both short and long RNA molecules for absolute quantification. It outperforms current poly(A)-independent total RNA-seq protocols by capturing transcripts of a broad size range, thus enabling simultaneous analysis of protein-coding, long-noncoding, microRNA, and other noncoding RNA transcripts from single cells. The researchers used Smart-seq-total to analyze the total RNAome of human primary fibroblasts, HEK293T, and MCF7 cells, as well as that of induced murine embryonic stem cells differentiated into embryoid bodies. By analyzing the coexpression patterns of noncoding RNA and mRNA from the same cell, the researchers discovered new roles of noncoding RNA in essential processes, such as the cell cycle and lineage commitment during embryonic development. Moreover, they showed that independent classes of short noncoding RNA can be used to determine cell-type identity.
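As a rough illustration of why unique molecular identifiers (UMIs) support absolute quantification, the sketch below collapses reads that share the same cell, gene, and UMI into a single molecule, so that PCR duplicates are not counted twice. The function name, the tuple layout, and the example reads are hypothetical and are not taken from the Smart-seq-total code; a real pipeline would typically also correct UMI sequencing errors.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell, gene) pair.

    `reads` is an iterable of (cell, gene, umi) tuples. This layout is an
    assumption made for illustration only.
    """
    umis = defaultdict(set)
    for cell, gene, umi in reads:
        # Reads with an identical UMI are treated as copies of one molecule.
        umis[(cell, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


# Made-up example reads: two of the MIR21 reads are PCR duplicates.
reads = [
    ("cell1", "MIR21", "AACG"),
    ("cell1", "MIR21", "AACG"),
    ("cell1", "MIR21", "TTGA"),
    ("cell1", "ACTB", "CCAT"),
]
molecule_counts = count_molecules(reads)
```

Here the three MIR21 reads collapse to two molecules, because two of them carry the same UMI.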
Smart-seq-total performance
(A) Schematic comparison of the Smart-seq2 and Smart-seq-total pipelines. Following cell lysis, total cellular RNA is polyadenylated, primed with anchored oligodT, and reverse transcribed in the presence of a custom degradable template-switching oligonucleotide (TSO). After reverse transcription, the TSO is enzymatically cleaved, and the single-stranded cDNA is amplified and cleaned up. Amplified cDNA is then either tagmented or directly indexed, pooled, and depleted of ribosomal sequences using DASH. The resulting indexed libraries are then pooled and sequenced on an Illumina platform. (B) Distribution of mapped reads across RNA types in human primary fibroblasts, HEK293T, and MCF7 cells, shown as the percentage of total reads mapped to each RNA type. The miscRNA class is additionally split into RN7SK, RN7SL, and other miscRNA categories. (C) Examples of coding and noncoding marker genes for each cell type. Top exemplary markers per biotype were computed among cell types using the Wilcoxon rank sum test. RNY1 belongs to the miscRNA class, SCARNA23 and SCARNA20 to the scaRNA class, and MT-TD to the mitochondrial tRNA class. (D) t-SNE plots of the three profiled human cell types generated using the indicated subsets of genes. From top to bottom: protein coding, lncRNA, miRNA, and other small ncRNA (including snoRNA, snRNA, scaRNA, scRNA, and miscRNA). Histone-coding genes were excluded from the protein-coding (polyA+) set, since a large fraction of these RNAs are known to lack polyA tails.
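The per-biotype read percentages described for panel (B) could be computed along the following lines. This is a minimal sketch, not the authors' analysis code: the function name, the "misc_RNA" biotype label, the gene-name prefix matching, and the example counts are all assumptions made for illustration.

```python
from collections import Counter


def biotype_percentages(read_counts, gene_biotype):
    """Return the percentage of mapped reads assigned to each RNA type.

    `read_counts` maps gene name -> mapped reads; `gene_biotype` maps gene
    name -> biotype label. Both inputs are hypothetical structures.
    """
    totals = Counter()
    for gene, n_reads in read_counts.items():
        biotype = gene_biotype.get(gene, "unassigned")
        if biotype == "misc_RNA":  # assumed annotation label for miscRNA
            # Split miscRNA into RN7SK, RN7SL, and everything else.
            if gene.startswith("RN7SK"):
                label = "RN7SK"
            elif gene.startswith("RN7SL"):
                label = "RN7SL"
            else:
                label = "other miscRNA"
        else:
            label = biotype
        totals[label] += n_reads
    total = sum(totals.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in totals.items()}


# Made-up counts for a single cell, for illustration only.
counts = {"ACTB": 50, "RN7SK": 20, "RN7SL1": 10, "RNY1": 10, "MIR21": 10}
biotypes = {
    "ACTB": "protein_coding",
    "RN7SK": "misc_RNA",
    "RN7SL1": "misc_RNA",
    "RNY1": "misc_RNA",
    "MIR21": "miRNA",
}
percentages = biotype_percentages(counts, biotypes)
```

Matching on gene-name prefixes keeps the sketch short; an annotation-based lookup would be more robust when gene names vary between reference releases.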
Availability – All code used for analysis is available on GitHub (https://github.com/aisakova/smart-seq-total/).