Alignment-free RNA quantification tools have significantly increased the speed of RNA-seq analysis. However, it is unclear whether these state-of-the-art RNA-seq analysis pipelines can quantify small RNAs as accurately as they do with long RNAs in the context of total RNA quantification.
Researchers from the University of Texas at Austin comprehensively tested and compared four RNA-seq pipelines for the accuracy of gene quantification and fold-change estimation on a novel total RNA benchmarking dataset, in which small non-coding RNAs are highly represented alongside long RNAs. The four pipelines comprised two commonly used alignment-free pipelines and two variants of alignment-based pipelines. They found that all pipelines quantified the expression of long and highly abundant genes with high accuracy. However, the alignment-free pipelines performed systematically worse in quantifying lowly abundant and small RNAs.
Analysis pipelines and experimental design
The researchers used two pipelines for each approach, alignment-based and alignment-free. The alignment-based pipelines consisted of (A) a HISAT2+featureCounts pipeline, which uses HISAT2 to align reads to the human genome and featureCounts for gene counting, and (B) TGIRT-map, a customized pipeline for analyzing TGIRT-seq data. Two alignment-free tools, Kallisto and Salmon, were used for quantifying transcripts. For the alignment-free tools, gene-level abundances were summarized by Tximport. All differentially expressed genes were called with DESeq2.
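
To make the gene-level summarization step concrete, here is a minimal Python sketch of the basic idea: transcript-level estimates are summed over all transcripts that belong to the same gene. It is a simplified illustration, not Tximport's actual implementation or the researchers' code. The function name, transcript and gene identifiers, and values are all hypothetical.

```python
from collections import defaultdict


def summarize_to_gene(tx_abundance, tx2gene):
    """Sum transcript-level abundances into gene-level totals.

    tx_abundance: dict mapping transcript ID -> estimated count (or TPM)
    tx2gene:      dict mapping transcript ID -> gene ID
    """
    gene_totals = defaultdict(float)
    for tx_id, value in tx_abundance.items():
        gene_id = tx2gene.get(tx_id)
        if gene_id is None:
            # Assumption for this sketch: transcripts without a gene
            # annotation are skipped.
            continue
        gene_totals[gene_id] += value
    return dict(gene_totals)


# Hypothetical identifiers and values, for illustration only.
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
tx_counts = {"tx1": 10.0, "tx2": 5.0, "tx3": 2.5, "tx4": 1.0}

gene_counts = summarize_to_gene(tx_counts, tx2gene)
print(gene_counts)
```

In this sketch the two transcripts of geneA are combined into a single gene-level value, and the unannotated transcript tx4 is dropped.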