Benchmarking principal component analysis for large-scale single-cell RNA-sequencing

Principal component analysis (PCA) is an essential method for analyzing single-cell RNA-seq (scRNA-seq) datasets, but on large-scale scRNA-seq datasets it takes a long time to compute and consumes large amounts of memory.

Researchers from the RIKEN Center for Biosystems Dynamics Research review existing fast and memory-efficient PCA algorithms and implementations and evaluate their practical application to large-scale scRNA-seq datasets. Their benchmark shows that some PCA algorithms based on Krylov subspace methods and randomized singular value decomposition are fast, memory-efficient, and more accurate than the other algorithms.
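To make the idea of randomized singular value decomposition concrete, here is a minimal sketch in Python with NumPy. It is not one of the benchmarked implementations. The function name `randomized_pca`, its parameters (`n_oversamples`, `n_iter`, `seed`), and the synthetic data are all assumptions made for illustration. It works on a small dense matrix, whereas practical tools for large scRNA-seq data typically use sparse or out-of-core storage.

```python
import numpy as np

def randomized_pca(X, n_components, n_oversamples=10, n_iter=4, seed=0):
    """Approximate the top principal components of X (rows = cells, columns = genes).

    Illustrative sketch of randomized SVD with power iterations;
    all names and defaults are hypothetical.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                      # center each gene (column)
    k = min(n_components + n_oversamples, min(Xc.shape))

    # Sample the range of Xc with a random Gaussian test matrix.
    Q, _ = np.linalg.qr(Xc @ rng.standard_normal((Xc.shape[1], k)))

    # Power iterations sharpen the subspace; QR keeps it numerically stable.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(Xc.T @ Q)
        Q, _ = np.linalg.qr(Xc @ Q)

    # Exact SVD of the small projected matrix (k x genes).
    B = Q.T @ Xc
    U_small, S, Vt = np.linalg.svd(B, full_matrices=False)

    scores = (Q @ U_small)[:, :n_components] * S[:n_components]
    return scores, Vt[:n_components], S[:n_components]

# Synthetic stand-in for an expression matrix: rank-5 signal plus weak noise.
rng = np.random.default_rng(1)
X = 3.0 * rng.standard_normal((500, 5)) @ rng.standard_normal((5, 200))
X += 0.1 * rng.standard_normal((500, 200))

scores, components, sing_vals = randomized_pca(X, n_components=5)

# Compare against the exact singular values from a full SVD.
exact = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)[:5]
rel_err = np.max(np.abs(sing_vals - exact) / exact)
print("max relative error in singular values:", rel_err)
```

The expensive full decomposition is replaced by a few products with a thin matrix and an SVD of a small `k`-row matrix, which is the source of the savings in time and memory when only the leading components are needed.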

Summary of results


(a) Theoretical properties summarized by the authors' literature review. (b) Properties related to each implementation. (c) Performance evaluated by benchmarking with real-world and synthetic datasets. (d) User-friendliness evaluated by several metrics.

The researchers develop a guideline to select an appropriate PCA implementation based on the differences in the computational environment of users and developers.

Tsuyuzaki K, Sato H, Sato K, Nikaido I (2020) Benchmarking principal component analysis for large-scale single-cell RNA-sequencing. Genome Biol 21:9. [article]
