Karolinska Institutet researchers have developed FuSeq, a fast and accurate method for discovering fusion genes. FuSeq uses quasi-mapping to map reads quickly, extracts initial candidates from split reads and from fusion equivalence classes of mapped reads, and then applies multiple filters and statistical tests to arrive at the final candidates. The researchers apply FuSeq to four validated datasets: breast cancer, melanoma and glioma datasets, plus one spike-in dataset. The results show high sensitivity and specificity in all four datasets and compare well against other methods such as FusionMap, TRUP, TopHat-Fusion, SOAPfuse and JAFFA. In terms of computational time, FuSeq is two-fold faster than FusionMap and orders of magnitude faster than the other methods. Because of these lower computational demands, FuSeq makes it practical to investigate fusion genes in large numbers of samples.
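To make the idea of fusion equivalence classes more concrete, here is a toy Python sketch. It is not FuSeq's code (FuSeq itself is written in C++ and R) and it makes several assumptions: the read-pair records, the transcript-to-gene table, the gene names and the `min_support` cutoff are all hypothetical, and the real method applies further filters and statistical tests that are not modelled here. The sketch groups read pairs by the sets of transcripts their two mates map to, keeps only the groups whose mates land on different genes, and then counts the supporting read pairs per gene pair.

```python
from collections import defaultdict


def fusion_equivalence_classes(read_pairs, tx_to_gene):
    """Group read pairs by the transcript sets hit by mate 1 and mate 2.

    Only pairs whose mates map to disjoint sets of genes are kept,
    since those are the ones that could support a fusion.
    """
    classes = defaultdict(list)
    for read_id, tx1, tx2 in read_pairs:
        genes1 = {tx_to_gene[t] for t in tx1}
        genes2 = {tx_to_gene[t] for t in tx2}
        if genes1.isdisjoint(genes2):
            key = (frozenset(tx1), frozenset(tx2))
            classes[key].append(read_id)
    return dict(classes)


def candidate_gene_pairs(classes, tx_to_gene, min_support=2):
    """Sum supporting read pairs per gene pair and apply a simple cutoff.

    min_support is an illustrative threshold, not a FuSeq default.
    """
    support = defaultdict(int)
    for (tx1, tx2), reads in classes.items():
        for g1 in {tx_to_gene[t] for t in tx1}:
            for g2 in {tx_to_gene[t] for t in tx2}:
                support[(g1, g2)] += len(reads)
    return {pair: n for pair, n in support.items() if n >= min_support}


# Hypothetical annotation and mapping results, for illustration only.
tx_to_gene = {
    "A-001": "GENE_A",
    "A-002": "GENE_A",
    "B-001": "GENE_B",
    "C-001": "GENE_C",
}
read_pairs = [
    ("r1", ["A-001", "A-002"], ["B-001"]),
    ("r2", ["A-001", "A-002"], ["B-001"]),
    ("r3", ["A-001"], ["A-002"]),  # both mates in GENE_A: not a fusion
    ("r4", ["C-001"], ["B-001"]),  # only one supporting pair
]

classes = fusion_equivalence_classes(read_pairs, tx_to_gene)
candidates = candidate_gene_pairs(classes, tx_to_gene)
print(candidates)
```

In this toy input, only the GENE_A/GENE_B pair reaches the cutoff; the pair seen in a single read pair is dropped, and the read pair whose mates both fall in GENE_A never forms a class at all.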
FuSeq pipeline for fusion gene detection: quasi-mapping of read pairs to extract mapped reads and split reads; statistical tests and filtering to eliminate false-positive fusion genes; collecting and merging fusion gene candidates from mapped reads and split reads; de novo assembly to verify and determine fusion sequences; and exporting information on the final candidates to files.
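The merging and export steps named in the pipeline can be pictured with a small Python sketch. This is an illustration under stated assumptions, not FuSeq's implementation: the function names, the column names (`gene5`, `gene3`, `mapped_reads`, `split_reads`), the tab-separated layout and the read counts are all hypothetical. It combines per-gene-pair support from the two evidence sources and writes one row per candidate.

```python
import csv
import io

# Hypothetical column names; FuSeq's real output files may differ.
FIELDS = ["gene5", "gene3", "mapped_reads", "split_reads"]


def merge_candidates(mapped_support, split_support):
    """Combine candidates found via mapped reads and via split reads.

    A gene pair seen in only one source gets a count of 0 for the other.
    """
    pairs = sorted(set(mapped_support) | set(split_support))
    return [
        {
            "gene5": g5,
            "gene3": g3,
            "mapped_reads": mapped_support.get((g5, g3), 0),
            "split_reads": split_support.get((g5, g3), 0),
        }
        for g5, g3 in pairs
    ]


def write_candidates(rows, handle):
    """Write merged candidates as tab-separated text to an open handle."""
    writer = csv.DictWriter(
        handle, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Made-up support counts, for illustration only.
mapped = {("GENE_A", "GENE_B"): 4}
split = {("GENE_A", "GENE_B"): 2, ("GENE_C", "GENE_D"): 3}

rows = merge_candidates(mapped, split)
buffer = io.StringIO()
write_candidates(rows, buffer)
print(buffer.getvalue())
```

Keeping the two counts in separate columns, rather than summing them, leaves it to a later step to decide how much weight each kind of evidence should carry.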
Availability – FuSeq is implemented in C++ and R and is available at https://github.com/nghiavtr/FuSeq for non-commercial use.