Quantifying circular RNA expression from RNA-seq data using model-based framework

Circular RNAs (circRNAs) are a class of non-coding RNAs that are widely expressed in various cell lines and tissues of many organisms. Although the exact function of most circRNAs remains largely unknown, their cell type- and tissue-specific expression suggests that they play crucial roles in many biological processes. Quantifying circRNA expression from high-throughput RNA-seq data is therefore becoming increasingly important. Many model-based methods have been developed to quantify linear RNA expression from RNA-seq data, but these methods are not applicable to circRNA quantification.

Here, researchers from Southeast University and the University of Nevada proposed a novel strategy that transforms circular transcripts into pseudo-linear transcripts and estimates the expression values of both circular and linear transcripts using an existing model-based algorithm, Sailfish. The new strategy can accurately estimate the expression of both linear and circular transcripts from RNA-seq data. Several factors, such as gene length, expression level, and the ratio of circular to linear transcripts, affected quantification performance for circular transcripts. Compared with count-based tools, the new computational framework performed better at estimating circRNA expression from both simulated and real ribosomal RNA-depleted (rRNA-depleted) RNA-seq datasets. In addition, accounting for circular transcripts when quantifying expression from rRNA-depleted RNA-seq data substantially increased the accuracy of linear transcript expression estimates. The proposed strategy was implemented in a program named Sailfish-cir.

The new strategy of quantifying circular RNA and linear RNA expression from RNA-seq data

(A) Transforming a circular transcript into a pseudo-linear transcript; (B) Flow chart for quantifying the expression of both circular and linear RNA transcripts from high-throughput RNA-seq data. The two phases that perform model-based estimation, indexing and quantification, are adapted from the Sailfish algorithm.
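To make the idea of a pseudo-linear transcript more concrete, here is a minimal Python sketch. It is not taken from the Sailfish-cir source, and the summary above does not say exactly how the transformation is done. The sketch assumes one plausible approach for a k-mer based quantifier: copy the first k-1 bases of the circular sequence onto its 3' end, so that every k-mer spanning the back-splice junction also appears in the linear sequence. The function names and the value of k are hypothetical.

```python
from typing import List


def circular_to_pseudo_linear(seq: str, k: int) -> str:
    """Return a linear sequence that contains every k-mer of a circular one.

    Assumption (not confirmed by the source): the first k-1 bases are
    appended to the 3' end so that junction-spanning k-mers are represented.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return seq
    # Index modulo n so that circles shorter than k-1 still wrap correctly.
    overhang = "".join(seq[i % n] for i in range(k - 1))
    return seq + overhang


def kmers(seq: str, k: int) -> List[str]:
    """List all k-mers of a linear sequence, left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    circ = "ACGTT"  # toy circular transcript; the junction joins T back to A
    pseudo = circular_to_pseudo_linear(circ, k=3)
    print(pseudo)
    print(kmers(pseudo, 3))
```

With this construction, a circular sequence of length n yields exactly n k-mers, one starting at each position of the circle, which is what a k-mer index would need in order to see reads that cross the junction.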

Availability: Sailfish-cir is freely available at https://github.com/zerodel/Sailfish-cir

Contact: wanjun.gu@gmail.com or tongz@medicine.nevada.edu

Li M, Xie X, Zhou J, Sheng M, Yin X, Ko EA, Zhou T, Gu W. (2017) Quantifying circular RNA expression from RNA-seq data using model-based framework. Bioinformatics [Epub ahead of print].
