Long non-coding RNAs (lncRNAs) are typically expressed at low levels and are inherently highly variable, which poses a fundamental challenge for differential expression (DE) analysis. In this study, Ghent University researchers comprehensively evaluated the performance of 25 pipelines for DE analysis of RNA-seq data, with a particular focus on lncRNAs and low-abundance mRNAs. Fifteen performance metrics were used to evaluate DE tools and normalization methods through simulations and analyses of six diverse RNA-seq datasets.
Gene expression data were simulated using non-parametric procedures so that realistic levels of expression and variability were preserved in the simulated data. Throughout the assessment, results for mRNAs and lncRNAs were tracked separately. All pipelines exhibited inferior performance for lncRNAs compared to mRNAs across all simulated scenarios and benchmark RNA-seq datasets. This substandard performance also applied to low-abundance mRNAs. No single tool uniformly outperformed the others. Variability, number of samples, and fraction of DE genes markedly influenced DE tool performance.
Overall, linear modeling with empirical Bayes moderation (limma) and a non-parametric approach (SAMSeq) showed good control of the false discovery rate (FDR) and reasonable sensitivity. Of note, achieving a sensitivity of at least 50% requires more than 80 samples when studying expression levels in realistic settings such as clinical cancer research. About half of the methods showed a substantial excess of false discoveries, making these methods unreliable for DE analysis and jeopardizing reproducible science.
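The two quantities discussed above, false discoveries and sensitivity, can be made concrete with a small sketch. It assumes a simulated dataset in which the truly DE genes are known; the function name and the gene identifiers are hypothetical and are not taken from the study. For a single dataset the code computes the false discovery proportion; the FDR is the average of that proportion over many simulated datasets.

```python
def fdp_and_sensitivity(called, truly_de):
    """Return (false discovery proportion, sensitivity) for one simulated dataset.

    called   -- genes a DE tool reports as significant (hypothetical input)
    truly_de -- genes simulated as differentially expressed (known ground truth)
    """
    called = set(called)
    truly_de = set(truly_de)

    true_pos = len(called & truly_de)   # correctly detected DE genes
    false_pos = len(called - truly_de)  # reported genes that are not truly DE

    # By convention the proportion is 0 when no gene is called.
    fdp = false_pos / len(called) if called else 0.0
    # Sensitivity is undefined when no gene is truly DE.
    sens = true_pos / len(truly_de) if truly_de else float("nan")
    return fdp, sens


# Toy example with made-up gene identifiers.
truth = {"g1", "g2", "g3", "g4"}
calls = {"g1", "g2", "g7"}
fdp, sens = fdp_and_sensitivity(calls, truth)
print(f"FDP = {fdp:.3f}, sensitivity = {sens:.3f}")
```

A tool controls the FDR at a nominal level such as 5% when the average of this proportion across repeated simulations stays at or below that level.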
DE tools assessment workflow
The study has four components: evaluation of five normalization methods, concordance analysis of DE tools, evaluation of the capability of DE tools to recover genes with known biological evidence of differential expression, and simulation procedures to study the statistical properties of DE tools, such as their ability to control the FDR and their sensitivity for detecting differential expression. Six diverse RNA-seq datasets were used for the comparison of normalization methods and the concordance analysis of DE tools: two from cultured cell lines (CRC AZA and NGP nutlin), two from inbred animals (Bottomly and Hammer), one from normal human tissues (GTEx), and one from human cancer cells (Zhang). Three series of simulations were performed, each starting from a different RNA-seq source dataset: Zhang, NGP nutlin, and GTEx.
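As an illustration of what a concordance analysis of DE tools can look like, the sketch below compares the sets of genes called significant by different tools. The summary above does not say which concordance measure the study used; the Jaccard index here is an assumption chosen for simplicity, and the tool names and gene identifiers are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty call sets are treated as fully concordant.
    return len(a & b) / len(union) if union else 1.0


def pairwise_concordance(calls_by_tool):
    """Return {(tool_1, tool_2): Jaccard index} for every pair of tools."""
    return {
        (t1, t2): jaccard(calls_by_tool[t1], calls_by_tool[t2])
        for t1, t2 in combinations(sorted(calls_by_tool), 2)
    }


# Made-up significant-gene sets for three hypothetical tools.
calls_by_tool = {
    "tool_A": {"g1", "g2", "g3"},
    "tool_B": {"g2", "g3", "g4"},
    "tool_C": {"g3"},
}
concordance = pairwise_concordance(calls_by_tool)
for pair, score in concordance.items():
    print(pair, round(score, 3))
```

Values near 1 indicate that two tools call largely the same genes, while values near 0 indicate little overlap.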
The detailed results of the study can be consulted through a user-friendly web application that gives guidance on selecting the optimal DE tool (http://statapps.ugent.be/tools/AppDGE/).