Detecting and quantifying isoforms from RNA-seq data is an important but challenging task. The problem is often ill-posed, particularly at low coverage. One promising direction is to exploit several samples simultaneously.
A team led by researchers at MINES ParisTech has developed a new method for solving the isoform deconvolution problem jointly across several samples. They formulate a convex optimization problem that allows information to be shared between samples and that can be solved efficiently. The researchers demonstrate the benefits of combining several samples on simulated and real data, and show that this approach outperforms pooling strategies and methods based on integer programming.
Multi-dimensional splicing graph with three samples. Each candidate isoform is a path from the source node s to the sink node t. Nodes, drawn as grey squares, correspond to ordered sets of exons. Each read is assigned to a unique node, corresponding to the exact set of exons that it overlaps. Note that more than two exons can constitute a node, which properly models reads spanning more than two exons. A vector of read counts (one component per sample) is then associated with each node of the graph. Note also that some components of a vector can be equal to zero.
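The node construction described in the caption can be illustrated with a small Python sketch. It is not the FlipFlop implementation: the function names, the half-open coordinate convention, the representation of a read as a list of aligned blocks, and all coordinates below are assumptions made for illustration. It only shows how each read maps to the ordered set of exons it overlaps and how one count vector per node, with one component per sample, is accumulated. Edges and source-to-sink paths are left out.

```python
from collections import defaultdict

def overlapped_exons(read_blocks, exons):
    """Return the ordered tuple of exon indices that a read overlaps.

    Assumptions for this sketch: `exons` is a list of half-open (start, end)
    intervals sorted by start, and `read_blocks` lists the aligned blocks of
    one (possibly spliced) read in the same coordinates.
    """
    hits = []
    for idx, (ex_start, ex_end) in enumerate(exons):
        if any(b_start < ex_end and ex_start < b_end
               for b_start, b_end in read_blocks):
            hits.append(idx)
    return tuple(hits)

def node_count_vectors(samples, exons):
    """Map each node (ordered exon set) to a vector of read counts,
    one component per sample."""
    n_samples = len(samples)
    counts = defaultdict(lambda: [0] * n_samples)
    for s, reads in enumerate(samples):
        for read_blocks in reads:
            node = overlapped_exons(read_blocks, exons)
            if node:  # ignore reads that overlap no annotated exon
                counts[node][s] += 1
    return dict(counts)

# Hypothetical gene with three exons and three samples (made-up coordinates).
exons = [(0, 100), (200, 300), (400, 500)]
sample_a = [[(10, 60)], [(80, 100), (200, 230)],
            [(90, 100), (200, 300), (400, 410)]]
sample_b = [[(20, 70)], [(20, 70)], [(250, 300), (400, 450)]]
sample_c = [[(85, 100), (200, 235)]]

counts = node_count_vectors([sample_a, sample_b, sample_c], exons)
for node, vector in sorted(counts.items()):
    print(node, vector)
```

In this toy input, the read spanning all three exons lands on its own node `(0, 1, 2)` rather than being split across junctions, and several count vectors contain zeros because a node was observed in only some of the samples.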
Availability – The software and source code are available at http://cbio.ensmp.fr/flipflop