The advent of single-cell RNA sequencing (scRNAseq) and additional single-cell omics technologies has provided scientists with unprecedented tools to explore biology at cellular resolution. However, obtaining enough good-quality reads per cell, and enough cells within each population of interest, is key to drawing relevant conclusions about the underlying biology of a dataset. For these reasons, scRNAseq studies are steadily increasing the number of cells analysed and the granularity of the resulting transcriptomic analyses.
Researchers at the Biodonostia Health Research Institute aimed to identify previously described fibroblast subpopulations in healthy adult human skin by using the largest dataset published to date (528,253 sequenced cells) and an unsupervised population-matching algorithm. Their reanalysis of this landmark resource demonstrates that a substantial proportion of cell transcriptomic signatures may be biased by cellular stress and response to hypoxic conditions.
Stress and hypoxia-related signatures in published human dermal fibroblast datasets
(A) UMAP plot of normal fibroblasts (after removal of hypoxic and stressed cell subsets) reveals conservation of some, but not all, cell types previously described in independent datasets. (B) UMAP plots of human dermal fibroblast subsets are shown for five published datasets, depicted by the average expression levels of stress and hypoxia gene signatures.
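As a rough illustration of what scoring cells by the average expression of a gene signature can look like, the minimal Python sketch below computes a per-cell mean over a set of signature genes and then sets aside cells that score highly. It is not the researchers' pipeline: the function names, the gene lists, the toy expression matrix and the threshold of 1.0 are all assumptions made for the example.

```python
import numpy as np


def signature_score(expr, gene_names, signature):
    """Return the mean expression of the signature genes for every cell.

    expr       : (n_cells, n_genes) array of normalised expression values
    gene_names : gene symbols matching the columns of expr
    signature  : gene symbols in the signature; genes missing from
                 gene_names are skipped
    """
    index = {gene: i for i, gene in enumerate(gene_names)}
    cols = [index[gene] for gene in signature if gene in index]
    if not cols:
        raise ValueError("no signature gene found in gene_names")
    return np.asarray(expr, dtype=float)[:, cols].mean(axis=1)


# Toy data: 3 cells x 5 genes. The values and gene lists are illustrative
# assumptions, not the signatures or data used in the study.
genes = ["FOS", "JUN", "HSPA1A", "VEGFA", "COL1A1"]
expr = np.array([
    [0.0, 0.2, 0.1, 0.0, 3.0],  # cell 0: low stress, low hypoxia
    [2.5, 3.0, 3.5, 0.1, 1.0],  # cell 1: high stress
    [0.1, 0.0, 0.2, 2.8, 2.0],  # cell 2: high hypoxia
])

stress = signature_score(expr, genes, ["FOS", "JUN", "HSPA1A"])
# SLC2A1 is absent from the toy matrix, so only VEGFA contributes here.
hypoxia = signature_score(expr, genes, ["VEGFA", "SLC2A1"])

# Hypothetical cut-off: keep only cells scoring at or below 1.0 on both.
threshold = 1.0
keep = (stress <= threshold) & (hypoxia <= threshold)
```

In this toy case only the first cell is retained. In practice the threshold and the gene lists would need to be chosen and validated for each dataset.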
The researchers postulate that experimental conditions must be designed carefully to avoid long processing times for biological samples. Additionally, the computational demands of large datasets might limit the extent of the analysis, possibly because of long processing times.