RNA-seq is a reference technology for detecting alternative splicing at a genome-wide level. Exon arrays remain widely used for gene expression analysis, but show a poor validation rate for splicing events. Commercial arrays that include probes spanning exon junctions have been developed to overcome this problem.
Researchers from the University of Navarra compared the performance of RNA-seq (Illumina HiSeq) and junction arrays (Affymetrix Human Transcriptome Array) for the analysis of transcript splicing events. Three different breast cancer cell lines were treated with CX-4945, a drug that severely affects splicing. To enable a direct comparison of the two platforms, the researchers adapted EventPointer, an algorithm that detects and labels alternative splicing events using junction arrays, to work on RNA-seq data as well. Both the results shared by the two technologies and the discrepancies between them were checked with over 200 PCR experiments. As might be expected, RNA-seq appears superior in cases where the technologies disagree, and it can discover novel splicing events that lie beyond the fixed probe sets of an array. Nevertheless, the two technologies show a high degree of coherence, with the correlation of EventPointer results exceeding 0.90. A decimation analysis, in which the sequencing reads are subsampled to lower depths, indicates that the detection power of the junction arrays is equivalent to that of RNA-seq with up to 60 million reads. These results suggest that exon-junction arrays are a viable alternative to RNA-seq for detecting alternative splicing events when the focus is on well-described transcriptional regions.
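The decimation idea can be illustrated with a minimal sketch. It assumes that decimation means randomly keeping each sequenced read with a fixed probability, applied here to per-junction read counts; the function name `decimate_counts`, the junction labels, and the counts are hypothetical and are not taken from the study or from EventPointer.

```python
import random


def decimate_counts(junction_counts, fraction, seed=0):
    """Simulate a shallower sequencing run by thinning junction read counts.

    Each read is kept independently with probability `fraction`
    (binomial thinning). This is an illustrative assumption, not the
    procedure used in the study.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the subsample is reproducible
    thinned = {}
    for junction, count in junction_counts.items():
        # Draw once per read and count how many reads survive.
        thinned[junction] = sum(1 for _ in range(count) if rng.random() < fraction)
    return thinned


# Hypothetical junction counts for one gene at full depth.
full_depth = {"e1-e2": 120, "e2-e3": 95, "e1-e3": 14}
half_depth = decimate_counts(full_depth, 0.5, seed=42)
```

Repeating the event detection at several fractions, and comparing each result with the array-based calls, is one way to estimate the read depth at which the two platforms have similar detection power.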
A: The CEL or BAM files are the input data for each technology. The splicing graph for each gene is built either from the array annotation files or directly from the sequenced reads. B: Each node in the splicing graph is split into two nodes that correspond to its start and end positions in the genome, respectively. EventPointer identifies the events within each gene and annotates the type of each event. In the figure, an exon cassette is highlighted among the events in the gene. C: The statistical significance of the events is computed. D: Finally, the top-ranked events are validated using PCR and the results are visualized in IGV.
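The node-splitting step in panel B can be sketched as follows. This is a simplified illustration and not EventPointer's implementation: it assumes exons are given as named coordinate pairs, that junctions are given as pairs of exon names, and that a cassette exon is a middle exon with both inclusion junctions and a skipping junction. All names (`build_split_graph`, `find_cassette_exons`) and coordinates are hypothetical.

```python
from itertools import combinations


def build_split_graph(exons, junctions):
    """Return the edge set of a splicing graph with split exon nodes.

    Every exon becomes two nodes, (name, "start") and (name, "end"),
    joined by an exon edge. Every junction becomes an edge from the
    end node of the donor exon to the start node of the acceptor exon.
    """
    edges = set()
    for name in exons:
        edges.add(((name, "start"), (name, "end")))
    for donor, acceptor in junctions:
        edges.add(((donor, "end"), (acceptor, "start")))
    return edges


def find_cassette_exons(exons, junctions):
    """List (upstream, cassette, downstream) exon triples."""
    edges = build_split_graph(exons, junctions)

    def spliced(a, b):
        return ((a, "end"), (b, "start")) in edges

    # Order exons by genomic start so triples run upstream to downstream.
    ordered = sorted(exons, key=lambda name: exons[name][0])
    return [
        (up, mid, down)
        for up, mid, down in combinations(ordered, 3)
        # Inclusion path up -> mid -> down plus skipping path up -> down.
        if spliced(up, mid) and spliced(mid, down) and spliced(up, down)
    ]


# Hypothetical gene with three exons; e2 can be included or skipped.
exons = {"e1": (100, 200), "e2": (300, 400), "e3": (500, 600)}
junctions = [("e1", "e2"), ("e2", "e3"), ("e1", "e3")]
cassettes = find_cassette_exons(exons, junctions)
```

Splitting each exon into a start node and an end node lets junctions attach to a specific exon boundary, which is what makes events such as alternative 5' or 3' splice sites distinguishable from cassette exons in a fuller implementation.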