Alternative usage of transcript isoforms from the same gene has been hypothesized to be an important feature of cancers. However, differential usage of gene transcripts between conditions – isoform switching – has not been comprehensively characterized within and across cancer types. To this end, University of Copenhagen researchers developed methods for identifying and visualizing isoform switches with predicted functional consequences. Using these methods, they characterized isoform switching in RNA-seq data from >5500 cancer patients covering 12 solid cancer types. Isoform switches with potential functional consequences were common, affecting ~19% of multiple-transcript genes. Among these, isoform switches leading to loss of DNA sequence encoding protein domains were more frequent than expected, particularly in pan-cancer switches. The researchers identified several isoform switches as powerful biomarkers: 31 switches were highly predictive of patient survival independent of cancer type. These data constitute an important resource for cancer researchers, available through interactive web tools. Moreover, the methods, available as an R package, enable systematic analysis of isoform switches in other RNA-seq data sets.
Data and data analysis
A) Strategy for detecting isoform switching from RNA-seq data in a single cancer type. Within each cancer type, the TCGA RNA-seq libraries can be divided by sample type, as illustrated in this classification tree. At the first level from the top, samples are divided depending on whether they are from healthy tissue or from tumors, as indicated by grey boxes. Such samples do not have to be paired (originating from the same patient). Isoform switch analysis between healthy and tumor samples on this level will be referred to as “unpaired” analysis. At the second level, we focus on those tumor and healthy tissue samples that originate from the same patients, forming pairs. Isoform switch analysis between such paired samples will be referred to as “paired” analysis. Only isoform switches detected in both the paired and the unpaired analysis are considered for functional consequence prediction (panel B).
B) Overview of the computational pipeline for isoform switch detection and functional prediction. The pipeline allows analysis starting from full-length isoform quantification data from RNA-seq experiments (e.g. the output of Cufflinks (18), Kallisto (19) or similar). The pipeline allows the identification of isoform switches, followed by computational annotation of isoforms and the prediction of functional consequences. Because of the specific sample setup in the TCGA data (panel A), we used only the parts of the pipeline for annotation concatenation and prediction of functional consequences for the analysis of the TCGA data (indicated by dotted lines), only considering the set of predicted consequences listed in the grey area.
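The filtering logic of panel A can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the researchers' implementation: the function names, the toy expression values, and the `min_dif` cutoff are all hypothetical, and no statistical testing is performed. An isoform's usage is approximated here as its share of the parent gene's total expression, and a gene is flagged when at least one isoform gains usage and another loses usage between healthy and tumor samples. Only genes flagged in both the unpaired and the paired comparison are kept.

```python
from collections import defaultdict


def isoform_fractions(expr):
    """Convert {(gene, isoform): expression} into each isoform's share of its gene total."""
    totals = defaultdict(float)
    for (gene, _isoform), value in expr.items():
        totals[gene] += value
    return {
        key: (value / totals[key[0]] if totals[key[0]] > 0 else 0.0)
        for key, value in expr.items()
    }


def switching_genes(healthy, tumor, min_dif=0.1):
    """Return genes where one isoform gains and another loses usage.

    min_dif is an assumed cutoff on the change in isoform fraction,
    chosen for illustration only.
    """
    frac_healthy = isoform_fractions(healthy)
    frac_tumor = isoform_fractions(tumor)
    gained, lost = set(), set()
    for key in frac_healthy.keys() & frac_tumor.keys():
        change = frac_tumor[key] - frac_healthy[key]
        if change >= min_dif:
            gained.add(key[0])
        elif change <= -min_dif:
            lost.add(key[0])
    # A switch needs both directions within the same gene.
    return gained & lost


# Toy mean expression values (hypothetical), keyed by (gene, isoform).
unpaired_healthy = {("GENE_A", "iso1"): 80.0, ("GENE_A", "iso2"): 20.0,
                    ("GENE_B", "iso1"): 50.0, ("GENE_B", "iso2"): 50.0}
unpaired_tumor = {("GENE_A", "iso1"): 30.0, ("GENE_A", "iso2"): 70.0,
                  ("GENE_B", "iso1"): 30.0, ("GENE_B", "iso2"): 70.0}
paired_healthy = {("GENE_A", "iso1"): 75.0, ("GENE_A", "iso2"): 25.0,
                  ("GENE_B", "iso1"): 50.0, ("GENE_B", "iso2"): 50.0}
paired_tumor = {("GENE_A", "iso1"): 25.0, ("GENE_A", "iso2"): 75.0,
                ("GENE_B", "iso1"): 48.0, ("GENE_B", "iso2"): 52.0}

unpaired_hits = switching_genes(unpaired_healthy, unpaired_tumor)
paired_hits = switching_genes(paired_healthy, paired_tumor)

# Keep only genes supported by both analyses, as described in panel A.
candidates = unpaired_hits & paired_hits
print(sorted(candidates))
```

In this toy data GENE_B switches only in the unpaired comparison, so the intersection drops it and only GENE_A moves on to functional consequence prediction.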
This study indicates that isoform switches with predicted functional consequences are common and important in dysfunctional cells, which in turn means that gene expression should be analyzed on the isoform level.
Availability – Three interactive web services for easy and fast exploration of isoform switches in cancer are available at: http://www.binf.ku.dk/services/#switch_cancer
IsoformSwitchAnalyzeR is available through Bioconductor: https://github.com/kvittingseerup/IsoformSwitchAnalyzeR