Long non-coding RNAs (lncRNAs) constitute a large yet mostly uncharacterized fraction of the mammalian transcriptome. Characterizing them requires a comprehensive, high-quality annotation of their gene structures and boundaries, which is currently lacking.
Here, researchers from the Barcelona Institute of Science and Technology describe RACE-Seq, an experimental workflow that addresses this gap by combining RACE (rapid amplification of cDNA ends) with long-read RNA sequencing. They apply RACE-Seq to 398 human lncRNA genes in seven tissues, leading to the discovery of 2,556 on-target, novel transcripts. About 60% of the targeted loci are extended at the 5′ or 3′ end, often reaching genomic hallmarks of gene boundaries. Analysis of the novel transcripts suggests that lncRNAs are as long, have as many exons and undergo as much alternative splicing as protein-coding genes, contrary to current assumptions. Overall, the researchers show that RACE-Seq is an effective tool for annotating an organism’s deep transcriptome, and that it compares favourably to other targeted sequencing techniques.
Schematic overview of RACE-Seq
Standard 5′ and 3′ RACE primers are designed to target exons of a gene and produce primary RACE products, which undergo a second round of RACE reactions using nested 5′ or 3′ RACE primers. Both standard and nested 5′ and 3′ RACE products are subjected to long-read sequencing, followed by mapping to the reference genome.
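To make the idea of a locus being "extended" at its 5′ or 3′ end concrete, the following minimal Python sketch compares a mapped read with an annotated gene interval in a strand-aware way. It is only an illustration, not the authors' pipeline: the names `Interval`, `is_on_target` and `extended_ends` are hypothetical, coordinates are assumed to be 0-based and half-open, and a read is assumed to be on-target simply when it overlaps the gene on the same strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A genomic interval on one chromosome (hypothetical helper type)."""
    start: int   # 0-based, inclusive (assumption)
    end: int     # 0-based, exclusive (assumption)
    strand: str  # "+" or "-"


def is_on_target(gene: Interval, read: Interval) -> bool:
    """Assumed definition: same strand and at least one overlapping base."""
    return (
        read.strand == gene.strand
        and read.start < gene.end
        and read.end > gene.start
    )


def extended_ends(gene: Interval, read: Interval) -> set:
    """Return the gene ends ("5p", "3p") that the read extends beyond."""
    if not is_on_target(gene, read):
        return set()
    beyond_left = read.start < gene.start
    beyond_right = read.end > gene.end
    # On the minus strand the 5' end is the higher genomic coordinate.
    if gene.strand == "+":
        five_prime, three_prime = beyond_left, beyond_right
    else:
        five_prime, three_prime = beyond_right, beyond_left
    ends = set()
    if five_prime:
        ends.add("5p")
    if three_prime:
        ends.add("3p")
    return ends


if __name__ == "__main__":
    gene = Interval(1000, 2000, "+")
    read = Interval(800, 1500, "+")
    print(extended_ends(gene, read))
```

In practice the intervals would come from the long-read alignments and the existing annotation, and a real analysis would also need to consider exon structure rather than gene spans alone.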