51% of non-canonical splice sites are not annotated in GENCODE

Scientists at the Pontifical Catholic University of Chile have uncovered the diversity of non-canonical splice sites in the human transcriptome using deep transcriptome profiling. They mapped a total of 3.7 billion human RNA-seq reads and developed a set of stringent filters to avoid false detections of non-canonical splice sites.

The scientists identified 184 splice sites with non-canonical dinucleotides and U2/U12-like consensus sequences. They selected 10 of these U2/U12-like non-canonical splice site events and successfully validated 9 of them by reverse transcriptase-polymerase chain reaction (RT-PCR) and Sanger sequencing. Analysis of the 184 U2/U12-like non-canonical splice sites indicates that 51% of them are not annotated in GENCODE. In addition, 28% of them are conserved in mouse and 76% are involved in alternative splicing events, some of them with tissue-specific alternative splicing patterns. Interestingly, the analysis identified some U2/U12-like non-canonical splice sites that are converted into canonical splice sites by A-to-I RNA editing. Moreover, the U2/U12-like non-canonical splice sites have a differential distribution of splicing regulatory sequences, which may contribute to their recognition and regulation. This analysis provides a high-confidence group of U2/U12-like non-canonical splice sites that exhibit distinctive features compared with human splice sites as a whole.
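To make the terminology concrete, here is a minimal Python sketch of how an intron can be classified by its terminal dinucleotides, and how a single A-to-I edit (inosine is read as G) could turn a non-canonical site into a canonical one. It is an illustration only, not the authors' pipeline: the function names are hypothetical, and the set of canonical pairs (GT-AG, GC-AG, AT-AC) is a common convention assumed here rather than taken from the paper's methods.

```python
# Assumption: dinucleotide pairs commonly treated as canonical.
# This set is illustrative and not taken from the paper's methods.
CANONICAL_PAIRS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}


def intron_dinucleotides(intron):
    """Return the (donor, acceptor) dinucleotides of an intron sequence."""
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron must be at least 4 nt long")
    return seq[:2], seq[-2:]


def is_canonical(intron):
    """True if the intron's terminal dinucleotides form a canonical pair."""
    return intron_dinucleotides(intron) in CANONICAL_PAIRS


def editable_to_canonical(intron):
    """List 0-based positions where one A-to-I edit (read as A->G) at a
    terminal dinucleotide would make a non-canonical intron canonical."""
    seq = intron.upper()
    if is_canonical(seq):
        return []
    hits = []
    # Only the two donor and two acceptor positions can change the pair.
    for i in (0, 1, len(seq) - 2, len(seq) - 1):
        if seq[i] == "A":
            edited = seq[:i] + "G" + seq[i + 1:]
            if is_canonical(edited):
                hits.append(i)
    return hits


if __name__ == "__main__":
    for intron in ("GTAAGTCCCAG", "GTAAGTCCCAA", "ATCCCAG"):
        print(intron, is_canonical(intron), editable_to_canonical(intron))
```

In this toy example, an intron ending in AA is non-canonical as encoded in the genome, but editing its last adenosine yields an AG acceptor. A real analysis would also need the alignment-level filters the authors describe, which this sketch does not attempt to reproduce.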

Parada GE, Munita R, Cerda CA, Gysling K. (2014) A comprehensive survey of non-canonical splice sites in the human transcriptome. Nucleic Acids Res [Epub ahead of print].