SIMLR – Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning

Single-cell RNA-seq technologies enable high-throughput gene expression measurement of individual cells, and allow the discovery of heterogeneity within cell populations. Measurement of cell-to-cell gene expression similarity is critical to the identification, visualization and analysis of cell populations. However, single-cell data introduce challenges to conventional measures of gene expression similarity because of the high level of noise, outliers and dropouts.

Here, researchers from Stanford University propose a novel similarity-learning framework, SIMLR (single-cell interpretation via multi-kernel learning), which learns an appropriate distance metric from the data for dimension reduction, clustering and visualization. They show that SIMLR separates subpopulations more accurately in single-cell data sets than existing dimension reduction methods do. Additionally, SIMLR demonstrates high sensitivity and accuracy on high-throughput peripheral blood mononuclear cell (PBMC) data sets generated by the GemCode single-cell technology from 10x Genomics.

Outline of SIMLR

(a) SIMLR learns a proper metric for the cell-to-cell distances using the gene expression and constructs a similarity matrix. (b) The similarity matrix is used for visualization of cells in 2-D and for dimension reduction for clustering.
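
The two steps in the caption can be illustrated with a simplified sketch. This is not the authors' implementation: SIMLR learns its metric from the data, whereas the code below assumes, purely for illustration, a uniform average of a few Gaussian kernels and a basic spectral embedding. The function names (`gaussian_kernels`, `combined_similarity`, `spectral_embedding`), the bandwidth values and the median-based scaling are all hypothetical choices, not taken from the paper.

```python
import numpy as np


def gaussian_kernels(X, sigmas=(0.5, 1.0, 2.0)):
    """Return one Gaussian kernel matrix (cells x cells) per bandwidth.

    X is a (cells x genes) expression matrix. Bandwidths are assumed
    values, scaled by the median squared distance between cells.
    """
    sq_norms = (X ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clamp floating-point negatives
    np.fill_diagonal(sq_dists, 0.0)       # a cell is at distance 0 from itself
    scale = np.median(sq_dists[sq_dists > 0])
    return [np.exp(-sq_dists / (2.0 * s ** 2 * scale)) for s in sigmas]


def combined_similarity(X, sigmas=(0.5, 1.0, 2.0)):
    """Step (a), simplified: average several kernels with uniform weights.

    Assumption: uniform weights stand in for weights learned from the data.
    """
    return np.mean(gaussian_kernels(X, sigmas), axis=0)


def spectral_embedding(S, n_components=2):
    """Step (b), simplified: embed cells using the similarity matrix S."""
    degree = S.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    normalized = d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order
    _, eigvecs = np.linalg.eigh(normalized)
    # drop the top (trivial) eigenvector, keep the next n_components
    return eigvecs[:, -(n_components + 1):-1][:, ::-1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for expression data: two groups of 20 cells, 50 genes
    cells = np.vstack([rng.normal(0, 1, (20, 50)), rng.normal(3, 1, (20, 50))])
    similarity = combined_similarity(cells)
    coords = spectral_embedding(similarity, n_components=2)
    print(similarity.shape, coords.shape)
```

The resulting 2-D coordinates could be plotted directly or passed to a clustering routine such as k-means, which mirrors the visualization and clustering uses described in panel (b).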

Wang B, Zhu J, Pierson E, Batzoglou S. (2016) Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. bioRxiv [Epub ahead of print].
