Comprehensive evaluation of noise reduction methods for single-cell RNA sequencing data

Normalization and batch correction are critical steps in processing single-cell RNA sequencing (scRNA-seq) data, which remove technical effects and systematic biases to unmask biological signals of interest. Although a number of computational methods have been developed, there is no guidance for choosing appropriate procedures in different scenarios.

Researchers at Vanderbilt University Medical Center assessed the performance of 28 scRNA-seq noise reduction procedures in 55 scenarios using simulated and real datasets. The scenarios accounted for multiple biological and technical factors that greatly affect denoising performance, including the relative magnitude of batch effects, the extent of cell population imbalance, the complexity of cell group structures, the proportion and similarity of nonoverlapping cell populations, dropout rates and variable library sizes. The researchers used multiple quantitative metrics and visualization of low-dimensional cell embeddings to evaluate how well each procedure mixed batches while preserving the original cell group and gene structures.


The evaluation workflow. (A) Scenarios and datasets studied. (B) Normalization and batch correction methods included in the study. (C) Adjustment performance was assessed by both visualization and quantitative metrics. Quantitative metrics were summarized in a circle plot, in which each circle represents the evaluation result of one procedure (row) measured by one metric (column). The size of the circle is determined by the metric score, and its color by the relative change of the score compared with the unadjusted baseline. Red indicates an increase, blue a decrease and gray no change in the score. The darker the circle, the larger the improvement or deterioration.
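The circle-plot encoding described in the caption can be illustrated with a small sketch. This is not the authors' plotting code; the function name `encode_circle`, the `Circle` fields, the definition of relative change as (score - baseline) / |baseline| and the `max_change` saturation point are all assumptions made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Circle:
    size: float      # proportional to the metric score
    hue: str         # "red", "blue" or "gray"
    darkness: float  # 0.0 (palest) to 1.0 (darkest)


def encode_circle(score, baseline, max_change=1.0, tol=1e-9):
    """Map one (procedure, metric) result to circle attributes.

    Assumptions (not taken from the paper): relative change is
    (score - baseline) / |baseline|, and darkness saturates once the
    absolute relative change reaches max_change.
    """
    if baseline == 0:
        # Relative change is undefined; fall back to the absolute change.
        change = score - baseline
    else:
        change = (score - baseline) / abs(baseline)

    if abs(change) <= tol:
        hue = "gray"   # score unchanged versus the unadjusted baseline
    elif change > 0:
        hue = "red"    # score increased
    else:
        hue = "blue"   # score decreased

    darkness = min(abs(change) / max_change, 1.0)
    return Circle(size=score, hue=hue, darkness=darkness)


# Hypothetical scores for one metric, compared with a baseline of 0.5.
for s in (0.75, 0.5, 0.25):
    print(s, encode_circle(s, baseline=0.5))
```

In a real figure, `size` and `darkness` would be passed to a plotting library; the point of the sketch is only that size tracks the raw score while color tracks the change relative to the unadjusted data.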


Based on their results, the researchers identified the technical and biological factors affecting the performance of each method and recommended suitable methods for different scenarios. In addition, they highlighted one challenging scenario in which most methods failed, resulting in overcorrection. The study not only provided a comprehensive guideline for selecting suitable noise reduction procedures but also pointed out unsolved issues in the field, especially the urgent need to develop metrics for assessing batch correction on imperceptible cell-type mixing.

Chu SK, Zhao S, Shyr Y, Liu Q. (2022) Comprehensive evaluation of noise reduction methods for single-cell RNA sequencing data. Briefings in Bioinformatics [Epub ahead of print]. [article]
