Portcullis – efficient and accurate detection of splice junctions from RNA-Seq

Next-generation sequencing (NGS) technologies enable rapid and cheap genome-wide transcriptome analysis, providing vital information about gene structure, transcript expression and alternative splicing. Key to this is the accurate identification of exon-exon junctions from RNA sequencing (RNA-Seq) reads. A number of RNA-Seq aligners capable of splitting reads across these splice junctions (SJs) have been developed. However, it has been shown that while they correctly identify most genuine SJs present in a given sample, they also often produce large numbers of incorrect SJs.

Researchers from the Earlham Institute describe the extent of this problem using popular RNA-Seq mapping tools, and present a new method, called Portcullis, to rapidly filter false SJs derived from spliced alignments. They show that Portcullis distinguishes between genuine and false positive junctions to a high degree of accuracy across different species, samples, expression levels, error profiles and read lengths. Portcullis is portable and efficient and, to their knowledge, is currently the only SJ prediction tool that reliably scales to large RNA-Seq datasets and large, highly fragmented genomes while still delivering accurate SJs.

A high-level view of the Portcullis pipeline

Input to Portcullis is a genome in FastA format and one or more BAM files created by an upstream RNA-Seq mapping tool. The first stage ensures the alignments are correctly merged, sorted and indexed. Next, all junctions found in the input are analysed and written to disk. Finally, the full set of junctions is filtered to remove likely false positives, and the filtered set is also written to disk. The user can either run the full pipeline in one go or run each stage separately.
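To make the junction-collection stage more concrete, here is a minimal sketch of how candidate splice junctions can be read off spliced alignments. It is not Portcullis's implementation. It relies only on the SAM/BAM convention that a CIGAR `N` operation marks a region skipped on the reference, which RNA-Seq aligners use for introns. The function names and the `(chrom, ref_start, cigar)` tuple layout are hypothetical, chosen for illustration, and alignment start positions are assumed to be 0-based.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def junctions_from_cigar(ref_start, cigar):
    """Return (intron_start, intron_end) pairs for each N operation.

    Coordinates are 0-based and half-open, assuming ref_start is 0-based.
    """
    junctions = []
    pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # The skipped reference region is the candidate intron.
            junctions.append((pos, pos + length))
        if op in REF_CONSUMING:
            pos += length
    return junctions


def count_junctions(alignments):
    """Count supporting reads per junction.

    `alignments` is an iterable of (chrom, ref_start, cigar) tuples,
    a hypothetical simplified stand-in for records read from a BAM file.
    """
    counts = Counter()
    for chrom, ref_start, cigar in alignments:
        for start, end in junctions_from_cigar(ref_start, cigar):
            counts[(chrom, start, end)] += 1
    return counts
```

A table like the one returned by `count_junctions` is the kind of raw junction set a filtering step would then examine. Read support alone cannot separate genuine from false junctions, which is the problem the filtering stage is meant to address.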

Availability – Project home page: https://github.com/TGAC/portcullis

Mapleson D, Venturini L, Kaithakottil G, Swarbreck D. (2018) Efficient and accurate detection of splice junctions from RNA-Seq with Portcullis. GigaScience [Epub ahead of print].
