The importance of RNA splicing in numerous cellular processes is well established. However, an underappreciated aspect is the ability of the spliceosome to recognize a set of very small (3–30 nucleotide, 1–10 amino acid) exons named microexons. Despite their small size, microexons and their regulation through alternative splicing have now been shown to play critical roles in protein and system function. Researchers at Columbia University discuss the discovery of microexons over time and the mechanisms by which their splicing is regulated, including recent progress made through deep RNA sequencing.
Identification of microexons
Currently available bioinformatics tools for identifying microexons.
(a) Salzberg/GMAP and OLego algorithms: (1) When cDNA sequences (Salzberg/GMAP) or RNA‐Seq reads (OLego) are compared to the reference genome, unannotated microexons appear as unmappable insertions in the cDNA/RNA‐Seq reads. (2) These algorithms search for potential matches to these segments at high resolution and evaluate candidate splice sites in the reference genome to identify novel exons. (3) After identification of the microexon, the reads map correctly.
(b) VAST‐TOOLS: (1) cDNA libraries are used to build an exon junction database. (2) All possible microexon candidates are enumerated in silico by searching for pairs of splice sites separated by 3–15 nt within known introns. (3) Read mapping for an RNA‐Seq library of interest is performed against this exon‐microexon‐exon junction database to detect microexons.
(c) ATMap: (1) Mapping of RNA‐Seq reads to a reference cDNA database lacking a microexon results in unmappable insertions in the read. (2) ATMap returns to the reference genome to identify splice sites surrounding a region that matches the read. (3) After identification of the microexon, the reads map correctly.
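The two ideas in the caption can be illustrated with a toy sketch in Python: enumerating every intronic segment flanked by a pair of splice sites 3–15 nt apart (the in silico step of panel b), and checking whether an unmappable read insertion matches one of those segments (the rescue step of panels a and c). This is not the implementation used by any of the tools named above. The function names, the toy intron, and the simplification that splice sites are just the canonical AG (acceptor) and GT (donor) dinucleotides are all assumptions made for illustration; real tools also score splice-site strength and align full reads.

```python
from typing import Iterator, List, Tuple

ACCEPTOR = "AG"  # assumed canonical 3' splice site: last two intronic bases before an exon
DONOR = "GT"     # assumed canonical 5' splice site: first two intronic bases after an exon


def enumerate_candidates(intron: str, min_len: int = 3,
                         max_len: int = 15) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) coordinates of candidate microexons.

    A candidate is any segment of min_len..max_len nt that is immediately
    preceded by AG and immediately followed by GT within the intron.
    """
    seq = intron.upper()
    n = len(seq)
    for start in range(2, n - 1):
        if seq[start - 2:start] != ACCEPTOR:
            continue
        for length in range(min_len, max_len + 1):
            end = start + length
            if end + 2 > n:  # no room left for a donor dinucleotide
                break
            if seq[end:end + 2] == DONOR:
                yield start, end


def rescue_insertion(intron: str, insertion: str) -> List[Tuple[int, int]]:
    """Return candidate microexons whose sequence equals an unmapped read insertion."""
    seq = intron.upper()
    ins = insertion.upper()
    return [(s, e)
            for s, e in enumerate_candidates(seq, len(ins), len(ins))
            if seq[s:e] == ins]


# Hypothetical intron containing the 6-nt segment ACGTTA between an AG and a GT.
toy_intron = "GTAAGTCCCCC" "AG" "ACGTTA" "GT" "CCCCCTTTCAG"

candidates = list(enumerate_candidates(toy_intron))
hits = rescue_insertion(toy_intron, "ACGTTA")
print(candidates)  # [(5, 15), (5, 19), (13, 19)]
print(hits)        # [(13, 19)]
```

Even in this short toy intron, enumeration alone yields three candidates, only one of which matches the read insertion. This hints at why evidence from reads or cDNA is needed to separate genuine microexons from chance pairs of splice-site dinucleotides.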