RNA-SeQC – a series of quality control metrics for RNA-seq data

RNA-SeQC is a Java program that computes a series of quality control metrics for RNA-seq data. The input is one or more BAM files. The output consists of HTML reports and tab-delimited files of metrics data. The program is useful for comparing sequencing quality across samples or experiments, for example when evaluating different experimental parameters. It can also be run on individual samples as a quality control step before continuing with downstream analysis.

Quality Control Metrics Include:

  • Read Counts
    • Total, unique, duplicate reads
    • Mapped reads and mapped unique reads
    • rRNA reads
    • Transcript-annotated reads (intragenic, intergenic, exonic, intronic)
    • Expression profiling efficiency (ratio of exon-derived reads to total reads sequenced)
    • Strand specificity
  • Coverage
    • Mean coverage (reads per base)
    • Mean coefficient of variation
    • 5′/3′ bias
    • Coverage gaps: count, length
    • Coverage plots
  • Downsampling
  • GC Bias
  • Correlation to reference expression profile
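
Two of the metrics above reduce to simple arithmetic, sketched below in Python for illustration. This is not RNA-SeQC's own code. Expression profiling efficiency follows the definition given in the list (exon-derived reads divided by total reads sequenced). The coefficient of variation is assumed here to be the population standard deviation of per-base coverage divided by its mean, averaged over transcripts; the exact transcript selection and estimator used by RNA-SeQC are not stated here, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def expression_profiling_efficiency(exonic_reads, total_reads):
    """Ratio of exon-derived reads to total reads sequenced."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return exonic_reads / total_reads


def coefficient_of_variation(per_base_coverage):
    """Standard deviation of per-base coverage divided by its mean.

    Assumption: population standard deviation; the tool may use a
    different estimator.
    """
    mu = mean(per_base_coverage)
    if mu == 0:
        raise ValueError("mean coverage is zero; CV is undefined")
    return pstdev(per_base_coverage) / mu


def mean_coefficient_of_variation(coverage_by_transcript):
    """Average the per-transcript CV, skipping transcripts with no coverage."""
    cvs = [
        coefficient_of_variation(cov)
        for cov in coverage_by_transcript.values()
        if sum(cov) > 0
    ]
    if not cvs:
        raise ValueError("no transcript has non-zero coverage")
    return mean(cvs)
```

For example, 750 exon-derived reads out of 1,000 sequenced gives an efficiency of 0.75. A lower mean coefficient of variation indicates more even coverage along transcripts.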

Availability: RNA-SeQC can be run online through the GenePattern genomic analysis platform, or downloaded and run locally.