RNA-SeQC for RNA-Seq Datasets

RNA-SeQC is a program that provides key measures of data quality for RNA-seq datasets. These metrics include yield, alignment and duplication rates, GC bias, rRNA content, regions of alignment (exon, intron, intragenic), continuity of coverage, 3′/5′ bias, and count of detectable transcripts, among others. The software supports multi-sample evaluation of library construction protocols, input materials and other experimental parameters. Its modular design enables pipeline integration and routine monitoring of key quality measures such as the number of alignable reads, duplication rates and rRNA contamination. These measures allow investigators to make informed decisions about which samples to include in downstream analysis. In summary, RNA-SeQC provides quality control measures critical to experiment design, process optimization and downstream computational analysis.
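
As an illustration of how such metrics might feed a sample-inclusion decision inside a pipeline, the following is a minimal Python sketch. It does not use RNA-SeQC itself or its actual output format: the column names (`sample`, `mapping_rate`, `duplication_rate`, `rrna_rate`), the thresholds and the example values are all hypothetical and would need to be adapted to the real metrics table and to the experiment at hand.

```python
import csv
import io

# Hypothetical cutoffs chosen for illustration only; these are NOT RNA-SeQC defaults.
THRESHOLDS = {
    "mapping_rate": ("min", 0.80),      # fraction of reads that align
    "duplication_rate": ("max", 0.50),  # fraction of reads marked as duplicates
    "rrna_rate": ("max", 0.10),         # fraction of reads from rRNA
}


def flag_samples(tsv_text, thresholds=THRESHOLDS):
    """Return {sample: [names of failed metrics]} for a tab-separated metrics table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    failures = {}
    for row in reader:
        failed = []
        for metric, (kind, limit) in thresholds.items():
            value = float(row[metric])
            if kind == "min" and value < limit:
                failed.append(metric)
            elif kind == "max" and value > limit:
                failed.append(metric)
        failures[row["sample"]] = failed
    return failures


# Made-up metrics for two samples, in the assumed column layout.
EXAMPLE = (
    "sample\tmapping_rate\tduplication_rate\trrna_rate\n"
    "S1\t0.95\t0.20\t0.02\n"
    "S2\t0.60\t0.30\t0.25\n"
)

if __name__ == "__main__":
    for sample, failed in flag_samples(EXAMPLE).items():
        status = "PASS" if not failed else "FAIL (" + ", ".join(failed) + ")"
        print(sample, status)
```

In this sketch a sample with an empty failure list would be kept for downstream analysis, while one with any failed metric would be set aside for review; where to draw those lines is an experimental judgment, not something the code decides.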

Availability and Implementation: See www.genepattern.org to run online, or www.broadinstitute.org/rna-seqc/ for the command-line tool.

  • DeLuca DS, Levin JZ, Sivachenko A, Fennell T, Nazaire MD, Williams C, Reich M, Winckler W, Getz G. (2012) RNA-SeQC: RNA-seq metrics for quality control and process optimization. Bioinformatics [Epub ahead of print].