High-throughput single-cell technologies provide an unprecedented view into cellular heterogeneity, yet they pose new challenges in data analysis and interpretation. In this protocol, researchers from Stanford University describe the use of Spanning-tree Progression Analysis of Density-normalized Events (SPADE), a density-based algorithm for visualizing single-cell data and inferring cellular hierarchy among subpopulations of similar cells. SPADE was initially developed for single-cell flow and mass cytometry data. It is implemented and applied using an open-source R package that runs on Mac OS X, Linux and Windows systems. A typical SPADE analysis takes ∼5 min on a laptop with a 2.27-GHz processor.
Overview of the SPADE algorithm
(a) Flowchart for the use of SPADE on a simulated 3D flow cytometry data set. (b) 3D scatterplot of a simulated data set, which is constructed from four abundant subpopulations. (c) Density-dependent down-sampled version of b, showing the same structure but with fewer cells. (d) Agglomerative hierarchical clustering of the down-sampled cells to generate cell clusters. (e) SPADE tree connecting all cell clusters. (f) SPADE tree color-coded according to the median marker expression intensity of the cells within each node, where each node is a cell cluster from d; each of the three trees shows the expression of a different marker across all cell clusters.
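The steps in panels c to f can be illustrated with a minimal sketch. This is not the SPADE R package and does not reproduce its implementation. It is a pure-Python toy that assumes a fixed-radius neighbor count as the density estimate, naive centroid-linkage clustering, and Prim's algorithm for the spanning tree. The function names, the simulated populations (one abundant, one rare) and all parameter values are hypothetical. Outlier removal and the up-sampling step that maps every original cell back to a cluster are omitted, so node medians here are computed from the down-sampled cells only.

```python
import math
import random
import statistics

def local_density(points, radius):
    """For each point, count the points within `radius` (the point itself included)."""
    return [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]

def density_downsample(points, radius, target_density, rng):
    """Keep cells in sparse regions; thin dense regions toward `target_density`."""
    kept = []
    for p, d in zip(points, local_density(points, radius)):
        # Retention probability falls as local density rises (assumed rule).
        if rng.random() < min(1.0, target_density / d):
            kept.append(p)
    return kept

def agglomerate(points, n_clusters):
    """Naive centroid-linkage agglomerative clustering; returns lists of points."""
    clusters = [[p] for p in points]

    def centroid(cluster):
        return [statistics.fmean(axis) for axis in zip(*cluster)]

    while len(clusters) > n_clusters:
        cents = [centroid(c) for c in clusters]
        # Merge the two clusters whose centroids are closest.
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: math.dist(cents[ab[0]], cents[ab[1]]),
        )
        clusters[i].extend(clusters.pop(j))
    return clusters

def minimum_spanning_tree(nodes):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) index pairs."""
    in_tree = {0}
    edges = []
    while len(in_tree) < len(nodes):
        i, j = min(
            ((a, b) for a in in_tree for b in range(len(nodes)) if b not in in_tree),
            key=lambda ab: math.dist(nodes[ab[0]], nodes[ab[1]]),
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges

rng = random.Random(0)
# Hypothetical simulated 3D data: one abundant and one rare population.
abundant = [[rng.gauss(0, 0.5) for _ in range(3)] for _ in range(300)]
rare = [[rng.gauss(5, 0.5) for _ in range(3)] for _ in range(20)]
cells = abundant + rare

sampled = density_downsample(cells, radius=1.0, target_density=10, rng=rng)
clusters = agglomerate(sampled, n_clusters=5)
# Each tree node is summarized by the per-marker median of its cells.
node_medians = [[statistics.median(axis) for axis in zip(*c)] for c in clusters]
tree_edges = minimum_spanning_tree(node_medians)

print(len(cells), "cells ->", len(sampled), "after down-sampling")
print("tree edges:", tree_edges)
```

Because retention probability decreases with local density, the rare population makes up a larger share of the down-sampled cells than of the original data, which is the property the density-dependent step is meant to provide. The values in `node_medians` are what a plot like panel f would map to node colors, one tree per marker.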
The researchers demonstrate the applicability of SPADE to single-cell RNA-seq data. They compare SPADE with recently developed single-cell visualization approaches based on the t-distributed stochastic neighbor embedding (t-SNE) algorithm. They contrast the implementation and outputs of these methods for normal and malignant hematopoietic cells analyzed by mass cytometry and provide recommendations for appropriate use. Finally, the researchers provide an integrative strategy that combines the strengths of t-SNE and SPADE to infer cellular hierarchy from high-dimensional single-cell data.