A team led by researchers at the Fred Hutchinson Cancer Research Center considers an increasingly popular study design in which single-cell RNA-seq data are collected from multiple individuals and the goal is to find genes that are differentially expressed between two groups of individuals. To this end, the researchers propose a statistical method named IDEAS (individual level differential expression analysis for scRNA-seq). For each gene, IDEAS summarizes the gene's expression in each individual as a distribution and then assesses whether these individual-specific distributions differ between the two groups. The researchers apply IDEAS to assess gene expression differences between autism patients and controls, and between COVID-19 patients with mild and severe symptoms.
An overview of the IDEAS pipeline
A toy example with 2 cases and 3 controls, with 2 or 3 cells per individual. For each gene, the gene expression distribution within each individual (e.g., P(x) and Q(x) in the figure) is summarized, the distance between these distributions is calculated for every pair of individuals, and a distance matrix (the bottom right corner of the figure) is obtained. This distance matrix, combined with additional individual-level covariates, is used for differential expression analysis between cases and controls.
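
The steps in this overview can be sketched for a single gene in a few lines of Python. This is only an illustration under stated assumptions, not the IDEAS implementation: the expression values are invented, the 1-D Wasserstein distance is one possible choice of distance between the individual-level distributions, the test is a PERMANOVA-style pseudo-F statistic evaluated over every possible case/control labeling, and individual-level covariates are ignored. The helper names `wasserstein_1d` and `pseudo_f` are hypothetical.

```python
from itertools import combinations

import numpy as np

# Invented expression values for one gene: 2 cases and 3 controls,
# with 2 or 3 cells per individual (mirrors the toy layout in the figure).
cells = [
    np.array([5.0, 6.0]),        # case 1
    np.array([6.0, 7.0, 8.0]),   # case 2
    np.array([1.0, 2.0]),        # control 1
    np.array([0.0, 1.0, 2.0]),   # control 2
    np.array([1.0, 1.0, 2.0]),   # control 3
]
is_case = np.array([1, 1, 0, 0, 0])
n = len(cells)


def wasserstein_1d(u, v):
    """1-D Wasserstein distance between two empirical distributions,
    computed as the area between their empirical CDFs."""
    u, v = np.sort(u), np.sort(v)
    grid = np.sort(np.concatenate([u, v]))
    widths = np.diff(grid)
    cdf_u = np.searchsorted(u, grid[:-1], side="right") / len(u)
    cdf_v = np.searchsorted(v, grid[:-1], side="right") / len(v)
    return float(np.sum(np.abs(cdf_u - cdf_v) * widths))


# Distance matrix between every pair of individuals.
dist = np.zeros((n, n))
for i, j in combinations(range(n), 2):
    dist[i, j] = dist[j, i] = wasserstein_1d(cells[i], cells[j])


def pseudo_f(dist, labels):
    """PERMANOVA-style pseudo-F statistic computed from a distance matrix."""
    m = len(labels)
    groups = np.unique(labels)
    sq = dist ** 2
    ss_total = sq[np.triu_indices(m, k=1)].sum() / m
    ss_within = 0.0
    for g in groups:
        idx = np.where(labels == g)[0]
        sub = sq[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (m - len(groups)))


f_obs = pseudo_f(dist, is_case)

# Exact permutation test: enumerate every way to pick which individuals are cases.
n_cases = int(is_case.sum())
f_perm = []
for chosen in combinations(range(n), n_cases):
    labels = np.zeros(n, dtype=int)
    labels[list(chosen)] = 1
    f_perm.append(pseudo_f(dist, labels))

p_value = float(np.mean(np.array(f_perm) >= f_obs - 1e-12))
print(f"pseudo-F = {f_obs:.2f}, permutation p-value = {p_value:.2f}")
```

With only 5 individuals there are just 10 distinct labelings, so the smallest attainable p-value is 0.1. The toy layout is useful for following the steps, but a real analysis needs many more individuals per group.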