Researchers at the University of Alabama at Birmingham have developed Polar Gini Curve, a method for characterizing cluster markers by analyzing single-cell RNA sequencing (scRNA-seq) data. Polar Gini Curve combines the gene expression and the 2D coordinates (“spatial”) information to detect patterns of uniformity in any clustered cells from scRNA-seq data.
The researchers demonstrate that Polar Gini Curve can help users characterize the shape and density distribution of cells in a particular cluster, which can be generated during routine scRNA-seq data analysis. To quantify the extent to which a gene is uniformly distributed in a cell cluster space, they combine two polar Gini curves (PGCs)—one drawn upon the cell-points expressing the gene (the “foreground curve”) and the other drawn upon all cell-points in the cluster (the “background curve”). They show that genes with highly dissimilar foreground and background curves tend not to uniformly distributed in the cell cluster—thus having spatially divergent gene expression patterns within the cluster. Genes with similar foreground and background curves tend to uniformly distributed in the cell cluster—thus having uniform gene expression patterns within the cluster. Such quantitative attributes of PGCs can be applied to sensitively discover biomarkers across clusters from scRNA-seq data. The researchers demonstrate the performance of the Polar Gini Curve framework in several simulation case studies. Using this framework to analyze a real-world neonatal mouse heart cell dataset, the detected biomarkers may characterize novel subtypes of cardiac muscle cells.
Overall workflow to compute RMSD metric for one gene in one cluster of cells
Data points, histogram, and PGCs for cells expressing the gene (foreground) and all cells (background) are shown in cyan and red, respectively. PGC, polar Gini curve; RMSD, root mean square deviation.
Availability – The source code and data for Polar Gini Curve could be found at https://figshare.com/projects/Polar_Gini_Curve/76749.