October 28th 4:00pm – 5:00pm
Presented by: Yun Li, PhD (University of North Carolina)
Single-cell RNA sequencing (scRNA-seq) allows researchers to examine the transcriptome at single-cell resolution and has been increasingly employed as technologies continue to advance. For technical and biological reasons unique to scRNA-seq data, clustering and batch effect correction are almost indispensable for valid and powerful data analysis. Multiple methods have been proposed for these two important tasks. For clustering, we have found that different methods, including state-of-the-art methods such as Seurat, SC3, CIDR, SIMLR, and t-SNE + k-means, yield varying results in terms of both the number of clusters and the actual cluster assignments. We have developed two ensemble methods, SAFE-clustering and SAME-clustering, that leverage hyper-graph partitioning algorithms and a mixture model-based approach, respectively, to produce a more robust and accurate ensemble solution on top of the clustering results from individual methods. For batch effect correction, we have developed methods based on supervised mutual nearest neighbor detection that harness the power of known cell type labels for certain single cells. We benchmarked all methods on various scRNA-seq datasets to demonstrate their utility.