Fusion genes play an important role in the tumorigenesis of many cancers. Next-generation sequencing (NGS) technologies have been successfully applied in fusion gene detection for the last several years, and a number of NGS-based tools have been developed for identifying fusion genes during this period. Most fusion gene detection tools based on RNA-seq data report a large number of candidates (mostly false positives), making it hard to prioritize candidates for experimental validation and further analysis. Selecting reliable fusion genes for downstream analysis has therefore become very important in cancer research. Researchers at the German Cancer Research Center have developed confFuse, a scoring algorithm to reliably select high-confidence fusion genes which are likely to be biologically relevant.
confFuse takes multiple parameters into account in order to assign each fusion candidate a confidence score; a score ≥8 indicates a high-confidence fusion gene prediction. These parameters were manually curated based on the authors' experience and on certain structural motifs of fusion genes. In a comparison with alternative tools on 96 published RNA-seq samples from different tumor entities, this method significantly reduced the number of fusion candidates (301 high-confidence out of 8,083 total predicted fusion genes) while maintaining high detection accuracy (recovery rate 85.7%). Validation of 18 novel, high-confidence fusions detected in three breast tumor samples resulted in a 100% validation rate.
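The thresholding step described above can be illustrated with a minimal Python sketch. It is not confFuse's implementation: the scoring parameters themselves are not given here, so the sketch assumes each candidate already carries a score, and the `FusionCandidate` record, the gene names, and the example scores are all hypothetical. Only the ≥8 cutoff and the 301 and 8,083 counts come from the text.

```python
from dataclasses import dataclass

# Cutoff stated in the text: a score >= 8 marks a high-confidence prediction.
HIGH_CONFIDENCE_CUTOFF = 8.0


@dataclass(frozen=True)
class FusionCandidate:
    """Hypothetical record for one scored fusion candidate."""
    gene5: str    # 5' partner gene (placeholder name)
    gene3: str    # 3' partner gene (placeholder name)
    score: float  # confidence score, assumed to be computed upstream


def select_high_confidence(candidates, cutoff=HIGH_CONFIDENCE_CUTOFF):
    """Keep only the candidates whose score reaches the cutoff."""
    return [c for c in candidates if c.score >= cutoff]


# Made-up candidates, for illustration only.
candidates = [
    FusionCandidate("GENE_A", "GENE_B", 9.5),
    FusionCandidate("GENE_C", "GENE_D", 8.0),
    FusionCandidate("GENE_E", "GENE_F", 6.5),
    FusionCandidate("GENE_G", "GENE_H", 3.0),
]
selected = select_high_confidence(candidates)

# Fraction of predictions retained in the reported comparison (301 of 8,083).
retained_fraction = 301 / 8083
print(f"kept {len(selected)} of {len(candidates)} example candidates")
print(f"reported comparison retained {retained_fraction:.1%} of predictions")
```

In the reported comparison, the cutoff therefore retains roughly 3.7% of all predicted fusion genes.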
Identified fusion genes and recovery rate of validated fusions among different tools
One hundred and twenty-six fusions were previously validated by RT-PCR. Five methods (fusionMap, deFuse, deFuse-0.81, confFuse-6.5, and confFuse-8) performed similarly in terms of recovery rate. confFuse generated far fewer fusion candidates than the others (higher specificity) while identifying a comparable number of validated fusions (similar sensitivity).
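For readers who want to reproduce this kind of comparison, the sketch below shows one plausible way to compute a recovery rate. It assumes that recovery rate means the fraction of previously validated fusions that a tool also reports; that definition, the `recovery_rate` helper, and the gene pairs are illustrative assumptions, not taken from the confFuse publication.

```python
def recovery_rate(validated, reported):
    """Fraction of validated fusions that also appear in a tool's output.

    Both arguments are iterables of (5' gene, 3' gene) pairs. The definition
    used here is an assumption made for illustration.
    """
    validated = set(validated)
    if not validated:
        raise ValueError("at least one validated fusion is required")
    return len(validated & set(reported)) / len(validated)


# Made-up gene pairs, for illustration only.
validated_fusions = [
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_D"),
    ("GENE_E", "GENE_F"),
    ("GENE_G", "GENE_H"),
]
tool_output = [
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_D"),
    ("GENE_E", "GENE_F"),
    ("GENE_X", "GENE_Y"),  # reported, but not among the validated fusions
]

rate = recovery_rate(validated_fusions, tool_output)
print(f"recovery rate: {rate:.1%}")
```

Comparing this rate against the total number of candidates each tool reports gives the sensitivity versus specificity trade-off summarized in the caption.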
confFuse is a novel downstream filtering method that allows selection of highly reliable fusion gene candidates for further analysis and experimental validation.
Availability – confFuse is available at https://github.com/Zhiqin-HUANG/confFuse