Alternative cleavage and polyadenylation (APA) contributes to the diversity of mRNA 3′ ends. By including or excluding cis-regulatory elements in mRNAs, it affects post-transcriptional regulation, altering transcript stability and translational efficiency. While APA analysis has been applied broadly in mixed populations of cells, the heterogeneity of APA among single cells has only recently begun to be explored. Researchers at the University of Colorado School of Medicine have developed an approach they termed scraps (Single Cell RNA PolyA Site Discovery), implemented as a user-friendly, scalable, and reproducible end-to-end workflow, to identify polyadenylation sites at near-nucleotide resolution in single cells using 10X Genomics and other TVN-primed single-cell RNA-seq (scRNA-seq) libraries. The approach performs best with long (>100 bp) read 1 sequencing and paired alignment to the genome. In that configuration it is unbiased relative to existing methods that use only read 2, and it recovers more sites at higher resolution, despite the reduction in read quality observed on most modern DNA sequencers following homopolymer stretches. For libraries sequenced without long read 1, the researchers implemented a fallback approach using read 2-only alignments that performs similarly to their optimal approach but recovers far fewer polyadenylation sites per experiment. scraps also enables assessment of internal priming capture events, which the researchers demonstrate are common and occur at higher frequency during apoptotic 3′ RNA decay. They also provide an R package, scrapR, that integrates the results of the scraps pipeline with the popular Seurat single-cell analysis package. Refinement and expanded application of these approaches will further clarify the role of APA in single cells, as well as the effects of internal priming on expression measurements in scRNA-seq libraries.
scraps utilizes positional information in TVN-primed libraries to map cleavage and polyadenylation sites at single-base resolution
A. Generic library construction schematic for TVN-based assays (PAS-seq, 10X Genomics 3′ single-cell, Drop-seq, etc.). B. Schematic representation of the scraps workflow, highlighting the ability to use existing Cell Ranger BAMs, paired alignments (preferred), or new read 2-only alignments to both quantify reference sites and identify novel poly(A) sites.
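To illustrate why a long read 1 carries positional information in a TVN-primed library, the following is a minimal Python sketch and not the scraps implementation. It assumes a read 1 layout of cell barcode, then UMI, then the oligo(dT) stretch, with default lengths of 16 and 12 bases chosen to resemble a 10X Genomics 3′ library. The function names, the `min_run` threshold, and the strict no-mismatch scan are all hypothetical choices made for illustration. Under these assumptions, the first base after the poly(T) run is the start of transcript-derived sequence, and its reverse complement reads in the mRNA sense, ending at the base immediately upstream of the poly(A) tail.

```python
from typing import Optional

# Translation table for reverse-complementing DNA (illustrative helper).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def polyt_boundary(read1: str, barcode_len: int = 16, umi_len: int = 12,
                   min_run: int = 10) -> Optional[int]:
    """Return the 0-based index of the first base after the poly(T) run.

    Assumes read 1 is laid out as barcode + UMI + poly(T) + cDNA.
    Returns None when the T run is shorter than `min_run`, i.e. the read
    does not look like it crosses a poly(A) junction.
    This strict scan stops at the first non-T base; it does not tolerate
    sequencing errors inside the homopolymer.
    """
    start = barcode_len + umi_len
    end = start
    while end < len(read1) and read1[end] == "T":
        end += 1
    if end - start < min_run:
        return None
    return end


def transcript_3prime_end(read1: str, **kwargs) -> Optional[str]:
    """Return the mRNA-sense sequence that precedes the poly(A) tail.

    Read 1 is antisense to the mRNA in this layout, so the bases after
    the poly(T) run are reverse-complemented.
    """
    boundary = polyt_boundary(read1, **kwargs)
    if boundary is None:
        return None
    return read1[boundary:].translate(_COMPLEMENT)[::-1]


# Toy read: 16-base barcode, 12-base UMI, 30 T's, then 9 cDNA bases.
example_read = "ACGTACGTACGTACGT" + "GGGGCCCCAAAA" + "T" * 30 + "GCATTCGGA"
example_boundary = polyt_boundary(example_read)
example_end = transcript_3prime_end(example_read)
```

In a real workflow the trimmed portion of read 1 would be aligned to the genome, and the aligned position of its first base would mark the candidate cleavage site. The strict scan here is a simplification, since base quality tends to drop after homopolymer stretches.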
Availability – The latest scraps source code is available at http://github.com/rnabioco/scraps. The latest scrapR source code is available at http://github.com/rnabioco/scrapr.