Single-cell transcriptomics is critical for understanding cellular heterogeneity and identifying novel cell types. Leveraging recent advances in single-cell RNA sequencing (scRNA-Seq) technology requires novel unsupervised clustering algorithms that are robust to high levels of technical and biological noise and that scale to datasets of millions of cells.
University of Connecticut researchers present novel computational approaches for clustering scRNA-Seq data based on the Term Frequency-Inverse Document Frequency (TF-IDF) transformation, which has been used successfully in text analysis.
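To make the analogy with text analysis concrete, the following is a minimal sketch of a TF-IDF transformation applied to a cell-by-gene count matrix. It assumes each cell plays the role of a document and each gene the role of a term, and it uses the textbook TF-IDF definition (relative count times log inverse document frequency). The exact formulation used by the researchers is not given here, so the function name `tfidf_transform`, the toy matrix, and the weighting scheme are illustrative assumptions only.

```python
import numpy as np


def tfidf_transform(counts):
    """Textbook TF-IDF on a (n_cells, n_genes) count matrix.

    Assumption: cells are treated as documents and genes as terms.
    This is an illustrative sketch, not the researchers' exact method.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]

    # Term frequency: each gene's share of the cell's total counts.
    cell_totals = counts.sum(axis=1, keepdims=True)
    tf = np.divide(counts, cell_totals,
                   out=np.zeros_like(counts), where=cell_totals > 0)

    # Inverse document frequency: genes detected in few cells get more weight.
    cells_expressing = (counts > 0).sum(axis=0)
    idf = np.log(n_cells / np.maximum(cells_expressing, 1))

    return tf * idf


# Hypothetical toy data: 4 cells (rows) by 4 genes (columns).
toy_counts = np.array([
    [5, 0, 3, 0],
    [4, 0, 0, 1],
    [0, 6, 2, 0],
    [0, 7, 0, 2],
])

scores = tfidf_transform(toy_counts)
print(scores.round(3))
```

Under these assumptions, a gene detected in every cell receives a score of zero, while a gene with high counts in only a few cells scores highly, so the resulting matrix could be passed to a standard clustering routine in place of the raw counts.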
Compared scRNA-Seq clustering methods
Experimental results show that the TF-IDF-based methods consistently outperform commonly used scRNA-Seq clustering approaches.