Single-cell RNA sequencing (scRNA-seq) provides an unprecedented view of the cellular diversity of biological systems. However, researchers from the University of Colorado School of Medicine estimate that, across the thousands of publications and datasets generated using this technology, only a minority (<25%) of studies provide cell-level metadata describing the cell types identified in the published dataset and the related findings. Omitting this metadata hinders reproduction, exploration, validation, and knowledge transfer, and the problem is common across journals, data repositories, and publication dates. The authors encourage investigators, reviewers, journals, and data repositories to improve their standards and ensure that these valuable datasets are properly documented.
Processed data files necessary for replicating single-cell studies
(A) Example of a gene-by-cell count matrix containing single-cell measurements and a cell-level metadata table containing annotations inferred from the analysis of the single-cell dataset. (B) Workflow of analysis steps for regenerating cell type or gene-expression signatures from public datasets for comparative analysis of single-cell datasets. * indicates a step requiring an analyst to make subjective decisions; ** indicates a step that often includes a nondeterministic algorithm.
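To make panel (A) concrete, the following is a minimal sketch of the two processed files and of why the metadata table matters. It is an illustration only, not the authors' method: the gene names, barcodes, counts, labels, and both helper functions are hypothetical, and per-cell-type mean counts stand in for a gene-expression signature. Real datasets would normally use sparse matrices and a dedicated single-cell library.

```python
from collections import defaultdict

# Hypothetical toy data: every barcode, count, and label below is made up.
# Gene-by-cell count matrix: one row per gene, one column per cell barcode.
barcodes = ["AAAC-1", "AAAG-1", "AACT-1", "AAGT-1"]
counts = {
    "CD3E": [5, 4, 0, 0],
    "MS4A1": [0, 0, 7, 9],
}

# Cell-level metadata table: one row per barcode, holding the annotations
# inferred during analysis (here a cell type label and a cluster id).
cell_metadata = {
    "AAAC-1": {"cell_type": "T cell", "cluster": 0},
    "AAAG-1": {"cell_type": "T cell", "cluster": 0},
    "AACT-1": {"cell_type": "B cell", "cluster": 1},
    "AAGT-1": {"cell_type": "B cell", "cluster": 1},
}


def missing_metadata(barcodes, cell_metadata):
    """Return the barcodes in the matrix that have no metadata row."""
    return [b for b in barcodes if b not in cell_metadata]


def mean_counts_by_cell_type(counts, barcodes, cell_metadata):
    """Average each gene's counts over the cells sharing a cell type label."""
    result = {}
    for gene, row in counts.items():
        totals = defaultdict(float)
        n_cells = defaultdict(int)
        for barcode, value in zip(barcodes, row):
            label = cell_metadata[barcode]["cell_type"]
            totals[label] += value
            n_cells[label] += 1
        result[gene] = {label: totals[label] / n_cells[label] for label in totals}
    return result


# With the metadata available, signatures follow directly from the labels.
signatures = mean_counts_by_cell_type(counts, barcodes, cell_metadata)
print(missing_metadata(barcodes, cell_metadata))
print(signatures)
```

When the metadata table is published, the labels can be joined to the matrix by barcode and reused directly. Without it, a reader has to rerun the workflow in panel (B), including its subjective and nondeterministic steps, and may not recover the same cell groupings.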