SERE – Single-parameter quality control and sample comparison for RNA-Seq

The Simple Error Ratio Estimate (SERE) is a single-parameter test procedure for count data that can determine whether two RNA-Seq libraries are faithful replicates or globally different.

  • Interpretation of SERE is unambiguous regardless of the total read count or the range of expression differences among bins (exons or genes): a score of 1 indicates faithful replication (i.e., the samples differ only by Poisson variation of individual counts), a score of 0 indicates data duplication, and scores >1 correspond to true global differences between RNA-Seq libraries.
  • By contrast, the interpretation of Pearson’s r is generally ambiguous and highly dependent on sequencing depth and on the range of expression levels inherent to the sample (the difference between the lowest and highest bin counts).
  • Cohen’s simple Kappa results are also ambiguous and are highly dependent on the choice of bins.

For quantifying global sample differences, SERE performs similarly to a measure based on the negative binomial distribution, yet it is simpler to compute. SERE can therefore serve as a straightforward and reliable single-parameter statistical procedure for the global assessment of pairs or large groups of RNA-Seq datasets.
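
To make the interpretation of the score concrete, here is a minimal Python sketch of a SERE-style computation. The text above does not state the formula, so this is an assumption: the score is taken as the square root of a Pearson-type dispersion statistic, that is, the squared deviations of observed counts from depth-scaled expected counts, each divided by its expected count (the Poisson variance) and averaged over bins and degrees of freedom. The function name `sere` and the genes-by-samples list layout are hypothetical choices for illustration.

```python
import math


def sere(counts):
    """Return a SERE-style score for a bins x samples table of raw read counts.

    ASSUMPTION: the score is the square root of a Pearson-type dispersion
    statistic; the summary this sketch accompanies does not give the formula.
    """
    n_samples = len(counts[0])
    if n_samples < 2:
        raise ValueError("need at least two samples")

    # Bins with no reads in any sample carry no information; drop them.
    rows = [row for row in counts if sum(row) > 0]
    if not rows:
        raise ValueError("no bins with non-zero counts")

    # Library sizes, taken over the retained bins.
    sample_totals = [sum(row[j] for row in rows) for j in range(n_samples)]
    if min(sample_totals) == 0:
        raise ValueError("every sample needs at least one read")
    grand_total = sum(sample_totals)

    dispersion = 0.0
    for row in rows:
        bin_total = sum(row)
        for j, observed in enumerate(row):
            # Expected count if the bin's reads were split by library size.
            expected = bin_total * sample_totals[j] / grand_total
            # Squared deviation relative to the Poisson variance (= expected).
            dispersion += (observed - expected) ** 2 / expected

    # Average over bins and (samples - 1) degrees of freedom per bin.
    return math.sqrt(dispersion / (len(rows) * (n_samples - 1)))


library = [10, 30, 5, 120]
print(sere([[c, c] for c in library]))  # duplicated data -> 0.0
print(sere([[10, 20], [30, 20]]))       # about 1.63, i.e. above 1
```

Because the expected counts are scaled by library size, a sample that is an exact multiple of another also scores 0 under this sketch, which is consistent with the claim that the score does not depend on total read count.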

