Sources of variability in experimentally derived data include measurement error in addition to the physical phenomena of interest. This measurement error combines systematic components, originating from the measuring instrument, with random errors. Novel biological technologies, such as single-cell RNA-seq, are plagued by systematic errors that may severely affect statistical analysis if the data are not properly calibrated.
Yale University researchers propose a novel deep learning approach for removing systematic batch effects. The method is based on a residual neural network, trained to minimize the Maximum Mean Discrepancy (MMD) between the multivariate distributions of two replicates measured in different batches. They apply the method to single-cell RNA-seq datasets and demonstrate that it effectively attenuates batch effects.
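To illustrate the idea, here is a minimal PyTorch sketch of the approach: a small residual network maps one batch onto a reference batch by minimizing an MMD loss with Gaussian kernels. This is not the authors' implementation (see the repository below); the network size, kernel bandwidths, and toy data are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): calibrate a "source" batch toward a
# "target" batch by training a residual network to minimize MMD between them.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim, hidden):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
    def forward(self, x):
        # Skip connection: the block learns a correction added to its input.
        return x + self.net(x)

class CalibrationNet(nn.Module):
    def __init__(self, dim, hidden=64, n_blocks=2):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(dim, hidden) for _ in range(n_blocks)])
    def forward(self, x):
        return self.blocks(x)

def gaussian_mmd(x, y, scales=(0.5, 1.0, 2.0)):
    """Biased MMD^2 estimate using a mixture of Gaussian (RBF) kernels."""
    def kernel(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return sum(torch.exp(-d2 / (2.0 * s ** 2)) for s in scales)
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

# Toy data standing in for two replicates: same population, one with a shift.
torch.manual_seed(0)
target = torch.randn(512, 10)          # reference batch
source = torch.randn(512, 10) + 0.8    # systematically shifted batch

model = CalibrationNet(dim=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(500):
    opt.zero_grad()
    loss = gaussian_mmd(model(source), target)  # align calibrated source to target
    loss.backward()
    opt.step()
```

Because each block outputs its input plus a learned correction, the network starts close to the identity map, which suits calibration: the calibrated data stay near the original measurements and only the systematic batch shift is removed.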
Calibration of scRNA-seq
Top: t-SNE plots before (left) and after (right) calibration using MMD-ResNet. Bottom: calibration of cells with high expression of Prkca. t-SNE plots before calibration (left), after calibration using ComBat (middle), and after calibration using MMD-ResNet (right).
Availability – the code and data are publicly available at: https://github.com/ushaham/BatchEffectRemoval.git.
Contact – [email protected]