Several multicenter benchmark data sets represent valuable steps toward using RNA-seq as a diagnostic tool with clinical utility.
RNAs are excellent candidates for monitoring and diagnosing disease. In recent years, high-throughput RNA sequencing (RNA-seq) has opened new possibilities for determining global expression patterns as well as identifying altered transcripts present at low levels. But for RNA-seq to move from a discovery tool to a diagnostic tool with clinical utility, the field must establish standard analysis methods and benchmark data sets for assessing analytical accuracy and reproducibility.
Building on earlier efforts, such as the MAQC-II study for microarray expression data, results reported in this issue describe two major collaborations establishing standards for RNA-seq: the Sequencing Quality Control (SEQC) project of the US Food and Drug Administration (FDA) and the Association of Biomolecular Resource Facilities (ABRF) next-generation sequencing study on RNA-seq. The papers assess sequencing platforms, experimental protocols and data analysis approaches across the collaborating sites.
“The authors identify library preparation as a major source of false positives and put forward several metrics that should be monitored, including GC content distribution, gene-body coverage uniformity, average error rate and insert size.”
“We anticipate new problems and terminology, such as ‘transcript of unknown significance,’ mirroring the established ‘variant of unknown significance.’”