RNA-Seq and gene expression microarrays provide comprehensive profiles of gene activity, but lack of reproducibility has hindered their application. A key challenge in the data analysis is the normalization of gene expression levels, which is currently performed following the implicit assumption that most genes are not differentially expressed. Here, researchers from Aarhus University present a mathematical approach to normalization that makes no assumption of this sort. The researchers have found that variation in gene expression is much larger than currently believed, and that it can be measured with available assays. Their results also explain, at least partially, the reproducibility problems encountered in transcriptomics studies. They expect that this improvement in detection will help efforts to realize the full potential of gene expression profiling, especially in analyses of cellular processes involving complex modulations of gene expression.
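To see why the assumption matters, consider a toy illustration of conventional median normalization. This is a minimal sketch, not the authors' MedianCD or SVCD method: the six expression values, the `median_normalize` helper, and the choice of four up-regulated genes are all hypothetical, chosen only to show what happens when most genes do change.

```python
from statistics import median


def median_normalize(log2_values):
    """Center one sample's log2 expression values on that sample's median."""
    m = median(log2_values)
    return [v - m for v in log2_values]


# Hypothetical log2 expression levels for six genes (illustrative values only).
control = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

# Assumed ground truth for this toy case: the last four genes are
# up-regulated 2-fold (+1 on the log2 scale), the first two are unchanged.
true_shift = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
treated = [c + s for c, s in zip(control, true_shift)]

norm_control = median_normalize(control)
norm_treated = median_normalize(treated)

# Per-gene log2 fold change as seen after median normalization.
estimated_shift = [t - c for t, c in zip(norm_treated, norm_control)]

print("true log2 shift:     ", true_shift)
print("estimated log2 shift:", estimated_shift)
```

Because most genes in this toy sample move in the same direction, the sample median moves with them. After normalization the four truly up-regulated genes appear unchanged, while the two unchanged genes appear down-regulated 2-fold.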
MedianCD and SVCD normalization allowed the detection of smaller variations in gene expression than Median and Quantile normalization
Boxplots display absolute values of DEG fold changes, for each treatment compared to the corresponding control, obtained after Median normalization (a), Quantile normalization (b), MedianCD normalization (c), and SVCD normalization (d). Boxplots are colored by treatment. Dashed horizontal lines indicate references of 1.5-fold and 2-fold changes.
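For readers unfamiliar with the quantity plotted, the short sketch below shows one way to turn log2 fold changes into absolute fold changes and compare them against the 1.5-fold and 2-fold reference lines. It assumes fold changes are stored on a log2 scale, which the caption does not state; the `abs_fold_change` helper and the five values are hypothetical, not data from the study.

```python
def abs_fold_change(log2_fc):
    """Convert a log2 fold change to a direction-free fold change (>= 1)."""
    return 2.0 ** abs(log2_fc)


# Hypothetical log2 fold changes for five differentially expressed genes.
log2_fcs = [0.3, -0.7, 1.2, -1.0, 0.5]
abs_fcs = [abs_fold_change(x) for x in log2_fcs]

# Bin the genes against the two reference thresholds used in the figure.
below_1_5 = sum(fc < 1.5 for fc in abs_fcs)
between = sum(1.5 <= fc < 2.0 for fc in abs_fcs)
at_least_2 = sum(fc >= 2.0 for fc in abs_fcs)

print(f"below 1.5-fold: {below_1_5}")
print(f"1.5-fold to 2-fold: {between}")
print(f"2-fold or more: {at_least_2}")
```

Genes in the first bin are the small-magnitude changes that the title of the figure refers to.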
Availability – MedianCD and SVCD normalization are available via the R package cdnormbio, installable from the GitHub repository https://github.com/carlosproca/cdnormbio.