Circular RNAs (circRNAs) are generated by backsplicing of immature RNA, which forms covalently closed loops of intron/exon RNA. The pervasiveness, evolutionary conservation, massive and regulated expression, and posttranscriptional regulatory roles of circRNAs in eukaryotes have been appreciated and described only recently. Moreover, as easily detectable disease markers, circRNAs undoubtedly represent a molecular class with high bearing on molecular pathobiology. CircRNAs can be detected from RNA-seq data using computational methods that identify sequence reads spanning backsplice junctions, which do not map colinearly to the reference genome. To this end, several programs have been developed, and critical assessment of the various strategies and tools suggests combining at least two methods as good practice for robust circRNA detection.
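The idea of a read that does not map colinearly can be made concrete with a small sketch. The Python below is illustrative only and is not taken from any of the published detection tools: the `Segment` record, its fields, and the `is_backsplice_candidate` function are hypothetical names, and the check assumes a split read has already been aligned as two pieces whose order along the RNA (5' to 3') is known. Real tools additionally check splice-site signals, mapping quality and read support.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a split read (hypothetical, simplified record)."""
    chrom: str
    strand: str    # "+" or "-"
    start: int     # 0-based genomic start, inclusive
    end: int       # genomic end, exclusive
    read_pos: int  # offset of this piece along the RNA, 5' to 3'


def is_backsplice_candidate(a: Segment, b: Segment) -> bool:
    """Return True if the two pieces map in reversed genomic order.

    In a linear transcript, the piece that comes later in the RNA maps
    downstream (relative to the strand) of the earlier piece. In a
    backsplice read, the later piece maps upstream instead.
    """
    if a.chrom != b.chrom or a.strand != b.strand:
        return False  # different chromosome or strand: not handled here
    first, second = sorted((a, b), key=lambda s: s.read_pos)
    if first.strand == "+":
        # Plus strand: downstream means larger coordinates, so a
        # backsplice has the later piece entirely before the earlier one.
        return second.end <= first.start
    # Minus strand: downstream means smaller coordinates.
    return second.start >= first.end


# Toy example with made-up coordinates.
linear_a = Segment("chr1", "+", 1000, 1050, read_pos=0)
linear_b = Segment("chr1", "+", 5000, 5050, read_pos=50)
circ_a = Segment("chr1", "+", 5000, 5050, read_pos=0)
circ_b = Segment("chr1", "+", 1000, 1050, read_pos=50)

print(is_backsplice_candidate(linear_a, linear_b))  # colinear read
print(is_backsplice_candidate(circ_a, circ_b))      # reversed order
```

In this toy case the second pair is flagged because the part of the read that comes later in the RNA aligns upstream of the part that comes earlier, which is the signature of a backsplice junction.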
Researchers at the University of Padova have developed CirComPara, an automated bioinformatics pipeline that detects, quantifies and annotates circRNAs from RNA-seq data by running four different backsplice-identification methods in parallel. CirComPara also quantifies linear RNAs and gene expression, ultimately comparing and correlating circRNA and gene/transcript expression levels. The developers applied their method to RNA-seq data from monocyte and macrophage samples in relation to haploinsufficiency of the RNA-binding splicing factor Quaking (QKI). The biological relevance of the results, in terms of the number, types and variations of circRNAs expressed, illustrates CirComPara's potential to enlarge knowledge of the transcriptome, adding detail on the circRNAome and facilitating further computational and experimental studies.
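Keeping only circRNAs supported by at least two detection methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not CirComPara's own code: the function name `consensus_circrnas`, the method labels `method_a` to `method_d`, and the representation of a backsplice as a `(chromosome, start, end, strand)` tuple are all hypothetical, and the coordinates are made up.

```python
from collections import defaultdict


def consensus_circrnas(calls_by_method, min_methods=2):
    """Keep backsplices reported by at least `min_methods` methods.

    calls_by_method maps a method label to the set of backsplices it
    reported; each backsplice is a (chrom, start, end, strand) tuple.
    Returns a dict mapping each retained backsplice to the sorted list
    of methods that reported it.
    """
    support = defaultdict(set)
    for method, junctions in calls_by_method.items():
        for junction in junctions:
            support[junction].add(method)
    return {
        junction: sorted(methods)
        for junction, methods in support.items()
        if len(methods) >= min_methods
    }


# Toy input: placeholder method names and made-up coordinates.
calls = {
    "method_a": {("chr1", 1000, 5050, "+"), ("chr2", 200, 900, "-")},
    "method_b": {("chr1", 1000, 5050, "+")},
    "method_c": {("chr3", 10, 400, "+")},
    "method_d": {("chr1", 1000, 5050, "+"), ("chr3", 10, 400, "+")},
}

shared = consensus_circrnas(calls)
for junction, methods in sorted(shared.items()):
    print(junction, methods)
```

A sketch like this assumes every method reports the same coordinates for the same backsplice; in practice, tools can differ in coordinate conventions (for example 0-based versus 1-based starts), so the calls would need to be normalized before they are compared.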
(A) CirComPara workflow. Rounded boxes represent inputs; the tools currently used are shown as gray labels next to the relevant pipeline level; dotted lines represent optional functions. (B–D) CirComPara summary plots of expressed circular RNAs (circRNAs): (B) absolute number of circRNAs detected by each method; (C) number of circRNAs detected in common by two or more methods; (D) number of circRNAs expressed per sample, considering both the whole set of detected backsplices and the selected subset of circRNAs detected by at least two methods.
Availability – http://github.com/egaffo/CirComPara