All RNA molecules are subject to post-transcriptional gene regulation, including mechanisms such as splicing, cleavage and polyadenylation, editing, transport, stability, and translation. These mechanisms rely on the specific recognition of functional RNA elements by RNA-binding proteins (RBPs), and a range of sequencing protocols based on cross-linking and immunoprecipitation (CLIP-Seq) have been developed to detect RBP target sites as well as RNA modifications.
Researchers from the Max Delbrück Center for Molecular Medicine present omniCLIP, a probabilistic approach to identify RBP-RNA interaction sites from CLIP data. The model provides a principled framework for the analysis of CLIP-Seq assays and takes into account several important new aspects. First, it jointly models the CLIP-Seq and background data in all replicates and accounts for confounding factors such as gene expression. Second, it uses an empirical Bayesian approach to identify and model important diagnostic events and sequencing errors. Finally, it models biological and technical variance. Overall, jointly modelling all information and uncertainties yields an accurate picture of the RNA-RBP interaction landscape.
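To make the role of background data concrete, the toy sketch below scores a candidate site by asking how surprising its CLIP read count is given the background count at the same site. It is not omniCLIP's model: it substitutes a simple one-sided Poisson test for the joint probabilistic model described above, and the function names, the `scale` factor (standing in for a library-size or expression adjustment) and the `pseudocount` are all hypothetical choices made for illustration.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam) by summing the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def enrichment_pvalue(clip_count, background_count, scale=1.0, pseudocount=1.0):
    """Toy enrichment score for one site (illustrative assumption, not omniCLIP).

    The expected CLIP count under "no binding" is taken to be the
    background count, plus a pseudocount, times a scaling factor.
    """
    lam = (background_count + pseudocount) * scale
    return poisson_sf(clip_count, lam)


if __name__ == "__main__":
    # Same CLIP signal, different background: a highly expressed gene
    # explains away reads that would look like binding elsewhere.
    print(enrichment_pvalue(clip_count=30, background_count=2))
    print(enrichment_pvalue(clip_count=30, background_count=40))
```

In this simplified picture, the same CLIP count is strong evidence at a site with little background and weak evidence at a site in a highly expressed gene, which is the kind of confounding the joint model is designed to account for.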
They show that omniCLIP can be applied to PAR-CLIP, HITS-CLIP, iCLIP and eCLIP data and that it outperforms every method they compared it against. This is remarkable because most competing methods are tuned for specific protocols. It also suggests that omniCLIP can be readily applied to new protocols, as all parameters are learned from the data.
Another advantage of omniCLIP is that it models the data in a principled way, i.e. each of its components has a clear probabilistic interpretation. This makes it easy to integrate other probabilistic models into omniCLIP, such as models of binding motifs, structure, or various biases, or explicit models of additional confounding factors.
Consequently, omniCLIP greatly simplifies the analysis of novel CLIP-Seq data, increases the reliability of results, and can pave the way for integrative studies based on data from CLIP-Seq assays.
Availability – The software for omniCLIP can be obtained from: https://github.com/philippdre/omniCLIP under the GNU GPL license (v3). The version of source code used in this manuscript has been deposited at: https://doi.org/10.5281/zenodo.1320207.