Genomic instability is a hallmark of cancer, so structural alterations and fusion genes are common events in the cancer landscape. RNA sequencing (RNA-Seq) is a powerful method for profiling cancers, but current methods for identifying fusion genes are optimised for short reads. JAFFA is a sensitive fusion detection method that outperforms other methods on reads of 100 bp or greater. JAFFA compares a cancer transcriptome to the reference transcriptome, rather than to the genome; the cancer transcriptome is inferred either from long reads directly or by de novo assembly of short reads.
JAFFA is based on the idea of comparing a sequenced transcriptome against a reference transcriptome. By default, JAFFA uses transcripts from GENCODE as the reference. In all JAFFA modes, reads aligning to intronic or intergenic regions are first removed to improve computational performance. The remaining sequences are then converted into a common form, referred to as tumour sequences, consisting of either assembled contigs or the reads themselves. These tumour sequences are processed by a core set of fusion-finding steps. First, sequences are aligned to the reference transcriptome and those that align to multiple genes are selected. Second, read support is determined. Third, putative candidates are aligned to the genome to check the genomic position of the breakpoints. Finally, JAFFA calculates characteristics of each fusion and uses these to prioritise candidates for validation.
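The first fusion-finding step, selecting sequences that align to more than one gene, can be illustrated with a minimal Python sketch. This is not JAFFA's own code: the `Alignment` record, the function names and the ordering by read support alone are assumptions made for illustration, and the alignments are assumed to have been produced by some external aligner against the reference transcriptome.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record: one hit of a tumour sequence on a reference transcript."""
    seq_id: str   # contig or read identifier
    gene: str     # gene that the matched reference transcript belongs to
    q_start: int  # start of the match on the tumour sequence
    q_end: int    # end of the match on the tumour sequence


def select_multi_gene(alignments: Iterable[Alignment]) -> Dict[str, List[str]]:
    """Return the sequences whose alignments span two or more distinct genes."""
    genes_by_seq = defaultdict(set)
    for aln in alignments:
        genes_by_seq[aln.seq_id].add(aln.gene)
    # Keep only sequences hitting at least two genes; sort gene names for stable output.
    return {
        seq_id: sorted(genes)
        for seq_id, genes in genes_by_seq.items()
        if len(genes) >= 2
    }


def rank_candidates(candidates: Dict[str, List[str]],
                    read_support: Dict[str, int]) -> List[str]:
    """Order candidate sequence IDs by descending read support (assumed criterion).

    Ties are broken alphabetically by sequence ID so the order is deterministic.
    """
    return sorted(candidates, key=lambda s: (-read_support.get(s, 0), s))
```

Here a sequence hitting a single gene is dropped, while one hitting two genes is kept as a putative fusion candidate. In practice further characteristics, such as the genomic position of the breakpoints, would feed into prioritisation rather than read support alone.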
Availability – https://github.com/Oshlack/JAFFA/wiki