Benchmarking algorithms for joint integration of unpaired and paired single-cell RNA-seq and ATAC-seq data

Single-cell RNA-sequencing (scRNA-seq) measures gene expression in single cells, while single-nucleus ATAC-sequencing (snATAC-seq) quantifies chromatin accessibility in single nuclei. These two data types provide complementary information for deciphering cell types and states. However, when analyzed individually, they sometimes produce conflicting results regarding cell type/state assignment. Separate analysis also sacrifices power, because the two modalities reflect the same underlying biology and could inform each other. Recently, it has become possible to measure both gene expression and chromatin accessibility from the same nucleus. Such paired (multiome) data enable direct modeling of the relationships between the two modalities. Given the vast amount of single-modality data already available, it is desirable to integrate paired data with unpaired single-modality datasets to gain a comprehensive view of cellular complexity.

University of Pennsylvania researchers benchmarked nine existing single-cell multi-omic data integration methods. Specifically, they evaluated to what extent multiome data provide additional guidance for analyzing existing single-modality data, and whether these methods can uncover peak-gene associations from single-modality data. Their results indicate that multiome data are helpful for annotating single-modality data. However, the researchers emphasize that accurate cell type annotation requires an adequate number of nuclei in the multiome dataset; too few nuclei may compromise the reliability of the annotations. Additionally, when generating a multiome dataset, the number of cells matters more than sequencing depth for cell type annotation.

Fig. 1. Outline of the benchmarking evaluations. (A) Scheme to evaluate whether multiome data help the integration of single-modality data. (B) Scenarios simulated to evaluate multi-omic integration methods.

The authors conclude that Seurat v4 is the best currently available platform for integrating scRNA-seq, snATAC-seq, and multiome data, even in the presence of complex batch effects.

Lee MYY, Kaestner KH, Li M. (2023) Benchmarking algorithms for joint integration of unpaired and paired single-cell RNA-seq and ATAC-seq data. Genome Biol 24(1):244.
