Single-cell RNA sequencing (scRNA-seq) provides high-resolution transcriptome data for studying the heterogeneity of cell populations at the single-cell level. Analyzing scRNA-seq data requires numerous computational tools. However, non-expert users often face installation issues, missing critical functionality or batch analysis modes, and the steep learning curves of existing pipelines. Researchers at the University of Oslo have developed cellsnake, a comprehensive, reproducible, and accessible single-cell data analysis workflow, to overcome these problems. Cellsnake offers advanced features for standard users and facilitates downstream analyses in both R and Python environments. It is also designed for easy integration into existing workflows, allowing rapid analysis of multiple samples. As an open-source tool, cellsnake is available through Bioconda, PyPI, Docker, and GitHub, making it a cost-effective and user-friendly option for researchers. By using cellsnake, researchers can streamline the analysis of scRNA-seq data and gain insights into the complex biology of single cells.
Overview of the scRNA-seq pipeline in cellsnake
(1) Cellsnake accepts the output files from cellranger as well as raw expression matrix files, provided they are in an appropriate format. (2) QC is performed, for example by filtering out MT genes, doublets, and cells with a low gene count. Clustree is then used to find the optimal clustering resolution. (3) Afterward, the dataset is normalized and scaled before PCA, and the results are visualized with UMAP or t-SNE. (4) To find differences in gene expression levels within the dataset, differential gene expression analysis is performed, with outputs such as heatmaps, dotplots, and marker plots. (5) For further insight into the dataset, the pipeline includes several functional analyses, such as GO enrichment, KEGG pathway analysis, gene set enrichment analysis, and CellChat. Metagenome analysis with the metagenomics tool Kraken2 is also available if the input in step 1 is the direct output of cellranger.
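The QC and normalization ideas in steps (2) and (3) can be illustrated with a minimal, self-contained Python sketch. It is not cellsnake's implementation and does not use its interface. The function names, the toy count data, and the thresholds (`min_genes`, `max_mito`) are hypothetical choices made for illustration. For simplicity, the sketch drops cells with a high fraction of MT-gene counts or few detected genes, then applies log-normalization with a scale factor of 10,000, a common convention that is assumed here rather than taken from the source.

```python
import math


def mito_fraction(counts):
    """Fraction of a cell's total counts that come from MT- genes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    mito = sum(n for gene, n in counts.items() if gene.upper().startswith("MT-"))
    return mito / total


def passes_qc(counts, min_genes=2, max_mito=0.2):
    """Keep a cell only if enough genes are detected and MT content is low.

    Both thresholds are hypothetical and chosen to suit the toy data below.
    """
    detected = sum(1 for n in counts.values() if n > 0)
    return detected >= min_genes and mito_fraction(counts) <= max_mito


def log_normalize(counts, scale_factor=10_000):
    """Scale counts to a common library size, then apply log(1 + x)."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(n / total * scale_factor) for gene, n in counts.items()}


def preprocess(cells, min_genes=2, max_mito=0.2):
    """Filter cells by QC, then log-normalize the cells that remain."""
    kept = {cid: c for cid, c in cells.items() if passes_qc(c, min_genes, max_mito)}
    return {cid: log_normalize(c) for cid, c in kept.items()}


# Toy data: cell ID -> {gene: raw count}. All values are made up.
cells = {
    "cell_a": {"GAPDH": 30, "ACTB": 50, "CD3E": 15, "MT-CO1": 5},  # passes QC
    "cell_b": {"GAPDH": 2, "ACTB": 0, "CD3E": 0, "MT-CO1": 18},    # high MT fraction
    "cell_c": {"GAPDH": 0, "ACTB": 4, "CD3E": 0, "MT-CO1": 0},     # too few genes
}

normalized = preprocess(cells)
print(sorted(normalized))
```

In this toy example only `cell_a` survives filtering. A real analysis would set thresholds per dataset and would follow normalization with scaling, PCA, and UMAP or t-SNE using dedicated single-cell libraries.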
Availability – The full source code and the pipeline are available on GitHub (github.com/sinanugur/cellsnake). Cellsnake is also available as a Bioconda package (anaconda.org/bioconda/cellsnake) and a Docker image (sinanugur/cellsnake:latest).