RNA sequencing (RNA-seq), a means of profiling the transcriptome, is being used with increasing frequency to characterize a growing array of conditions, from birth defects detected prenatally to disorders of the elderly. But the technique, a relatively new application of next-generation sequencing, has yet to win the full confidence of patients, clinicians, and researchers. Just how accurate is this form of sequencing?
To answer that question, two initiatives, one led by the Sequencing Quality Control (SEQC) Consortium and the other by the Association of Biomolecular Resource Facilities (ABRF), undertook a number of studies. Both took up the challenge of rigorously defining the scope and sources of variation in RNA sequencing data.
Many of the findings produced by these initiatives appeared in the September issue of Nature Biotechnology, which focuses on the performance of RNA sequencing. In particular, the issue emphasizes large-scale studies involving data generated across multiple sequencing sites, platforms, or protocols:
- “A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium”—This study, which assessed RNA-seq performance for junction discovery and differential expression profiling and compared it to microarray and quantitative PCR (qPCR) data using complementary metrics, concluded that RNA-seq can be “a versatile tool for relative expression profiling, with comparable or superior performance to microarrays in many applications given sufficient read depth and appropriate choice of analysis pipeline.”
- “Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study”—In this study, researchers carried out replicate experiments across 15 laboratory sites using reference RNA standards to test four protocols on five sequencing platforms. The results: “high intraplatform and inter-platform concordance for expression measures across the deep-count platforms, but highly variable efficiency and cost for splice junction and variant detection between all platforms.”
- “The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance”—Noting that the concordance of RNA-seq with microarrays for genome-wide analysis of differential gene expression had not been rigorously assessed across a range of chemical treatment conditions, this study generated Illumina RNA-seq and Affymetrix microarray data from the same liver samples of rats exposed in triplicate to varying degrees of perturbation by 27 chemicals representing multiple modes of action (MOAs). The authors reported that RNA-seq outperforms microarrays (93% versus 75%) in the verification of differentially expressed genes, as assessed by qPCR, with the gain mainly due to its improved accuracy for low-abundance transcripts. They added: “Nonetheless, classifiers to predict MOAs perform similarly when developed using data from either platform. Therefore, the endpoint studied and its biological complexity, transcript abundance and the genomic application are important factors in transcriptomic research and for clinical and regulatory decision making.”