SpatialPrompt – spatially aware scalable and accurate tool for spot deconvolution and domain identification in spatial transcriptomics

In the world of biology, understanding the details of how cells function and interact within their natural environments is crucial. One of the most advanced techniques for this purpose is spatial transcriptomics, which allows scientists to map the activity of genes within tissue samples. However, this technique comes with its own set of challenges, particularly when it comes to efficiently identifying different cell types within these samples. Enter SpatialPrompt, a new tool designed to tackle these challenges head-on.

The Challenge of Mapping Cell Types

Spatial transcriptomics involves examining gene expression in different regions of a tissue sample. However, traditional methods for analyzing this data, known as spot deconvolution tools, often fall short in two significant ways:

  1. Ignoring Spatial Coordinates: Many existing tools do not take into account the spatial information of the cells. This means they miss out on understanding how the physical arrangement of cells affects their function and interaction.
  2. Performance Issues with Large Datasets: As datasets grow in size, these tools tend to slow down significantly, making it impractical to use them for large-scale studies.

Introducing SpatialPrompt

SpatialPrompt addresses both issues by integrating gene expression data with spatial location information. Here's how it works and what sets it apart:

  1. Integration with scRNA-seq Data: SpatialPrompt uses single-cell RNA sequencing (scRNA-seq) data as a reference to accurately determine the proportion of different cell types in spatial spots. This helps in mapping cell types more precisely within the tissue sample.
  2. Advanced Computational Techniques: The tool employs non-negative ridge regression and graph neural networks. These techniques allow SpatialPrompt to efficiently capture local microenvironment information, leading to more accurate results.
  3. Speed and Efficiency: One of the standout features of SpatialPrompt is its speed. In extensive benchmarking on datasets from platforms such as Visium, Slide-seq, and MERFISH, SpatialPrompt outperformed 15 existing tools. For example, on a mouse hippocampus dataset with 50,000 spots, it completed spot deconvolution and domain identification in just 2 minutes, which is 44 to 150 times faster than current methods.
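
Non-negative ridge regression, one of the techniques named above, can be sketched in a few lines. This is only an illustration and not the SpatialPrompt implementation: the function name nonnegative_ridge, the projected-gradient solver, the alpha value, and the toy data are all assumptions made for this example.

```python
import numpy as np

def nonnegative_ridge(X, Y, alpha=1.0, n_iter=500):
    """Minimise ||X W - Y||^2 + alpha * ||W||^2 subject to W >= 0.

    Hypothetical helper: solved by projected gradient descent, i.e. take a
    gradient step and then clip negative coefficients to zero.
    X is (n_samples, n_features); Y is (n_samples, n_targets).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    xty = X.T @ Y
    # Step size 1/L, with L the largest eigenvalue of the symmetric matrix `gram`.
    step = 1.0 / np.linalg.eigvalsh(gram)[-1]
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = gram @ W - xty                  # gradient of half the objective
        W = np.maximum(W - step * grad, 0.0)   # projection onto W >= 0
    return W

# Toy data (assumed): 200 samples, 5 features, 2 targets, known non-negative weights.
rng = np.random.default_rng(0)
X = rng.random((200, 5))
W_true = np.array([[1.0, 0.0], [0.0, 2.0], [0.5, 0.5], [0.0, 0.0], [3.0, 1.0]])
Y = X @ W_true + 0.01 * rng.standard_normal((200, 2))

W_hat = nonnegative_ridge(X, Y, alpha=0.1)
print(np.round(W_hat, 2))
```

The ridge penalty keeps the fit stable when features are correlated, and the non-negativity constraint suits expression data, where negative contributions have no obvious biological meaning.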

Overall workflow of SpatialPrompt

Fig. 1

The SpatialPrompt framework takes as input the spatial matrix (Msp) with spatial coordinate information and the single-cell RNA-seq (scRNA-seq) matrix (Msc) with cell-type annotations.

  a. The custom spatial spot simulator uses Msc and the cell-type annotations to generate a simulated expression matrix (Msim) with a known cell-type proportion matrix (Ksim). Spatial data is simulated under three scenarios to mimic real spatial data.
  b. Msp and the spatial coordinates are used to calculate, for each spatial spot in Msp, the weighted mean expression (WME) of its neighbours in the same microenvironment. The matrix Msp^w stores the WME values for the nsp real spatial spots.
  c. The non-negative ridge regression (NRR) model is built using the integrated spatial matrix Msp^cat. Next, the NRR model is employed to predict the WME for each simulated spot in Msim, using the real spatial expression in Msp and its weighted mean neighbour expression in Msp^w. The integrated simulated matrix Msim^cat is obtained by combining Msim and Msim^w column-wise.
  d. For spatial deconvolution, a KNN regressor model is trained on Msim^cat and Ksim. This model then predicts the cell-type proportions in the real spatial matrix (Msp^cat).
  e. For domain identification, spatial clustering is performed using the K-means algorithm on the integrated spatial matrix (Msp^cat).

This figure was created with BioRender.com.
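
To make panels a, b and e of the workflow more concrete, here is a rough NumPy sketch on toy data. It is not the SpatialPrompt code. The function names, the inverse-distance weighting, the single random-mixing simulation scenario, the farthest-point K-means initialisation, and every parameter value are assumptions chosen for illustration. Msp^cat is assumed to be Msp and Msp^w joined column-wise, by analogy with Msim^cat. Panels c and d, which depend on the regression step, are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_spots(M_sc, labels, n_types, n_spots, cells_per_spot=8):
    """Panel a (assumed scheme): sum random reference cells into synthetic spots."""
    M_sim = np.zeros((n_spots, M_sc.shape[1]))
    K_sim = np.zeros((n_spots, n_types))
    for i in range(n_spots):
        picked = rng.integers(0, M_sc.shape[0], size=cells_per_spot)
        M_sim[i] = M_sc[picked].sum(axis=0)
        # Known cell-type proportions of this synthetic spot.
        K_sim[i] = np.bincount(labels[picked], minlength=n_types) / cells_per_spot
    return M_sim, K_sim

def weighted_mean_expression(M_sp, coords, k=6):
    """Panel b (assumed weighting): inverse-distance mean over the k nearest spots."""
    # Dense pairwise distances: fine for toy sizes, too costly for large datasets.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)              # a spot is not its own neighbour
    nbrs = np.argsort(dist, axis=1)[:, :k]
    w = 1.0 / (np.take_along_axis(dist, nbrs, axis=1) + 1e-9)
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum("ik,ikg->ig", w, M_sp[nbrs])

def kmeans(X, n_clusters, n_iter=50):
    """Panel e: Lloyd's algorithm with a deterministic farthest-point start."""
    centres = [X[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centres], axis=0)
        centres.append(X[d.argmax()])
    centres = np.array(centres, dtype=float)
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):             # keep the old centre if a cluster empties
                centres[c] = X[labels == c].mean(axis=0)
    return labels

# Toy reference: 60 cells of 2 types, 4 genes, with made-up type signatures.
signatures = np.array([[20, 20, 1, 1], [1, 1, 20, 20]])
cell_labels = np.repeat([0, 1], 30)
M_sc = rng.poisson(signatures[cell_labels])
M_sim, K_sim = simulate_spots(M_sc, cell_labels, n_types=2, n_spots=50)

# Toy tissue: 10 x 10 grid of spots, left half is one domain, right half another.
coords = np.array([(x, y) for x in range(10) for y in range(10)], dtype=float)
left = coords[:, 0] < 5
M_sp = rng.poisson(np.where(left[:, None], signatures[0], signatures[1]))

M_sp_w = weighted_mean_expression(M_sp, coords)   # Msp^w
M_sp_cat = np.hstack([M_sp, M_sp_w])              # Msp^cat (assumed layout)
domains = kmeans(M_sp_cat, n_clusters=2)
```

Joining each spot's own expression with its neighbourhood average means that two spots with similar expression but different surroundings end up with different feature vectors, which is what lets a plain clustering step such as K-means pick up spatial domains.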

Superior Performance and a Vast Database

SpatialPrompt is also integrated with a curated database of over 40 scRNA-seq datasets. This gives users ready access to high-quality reference data for their analyses, which supports more reliable results.

Why SpatialPrompt Matters

The ability to quickly and accurately map cell types in situ has profound implications for research and medicine. For researchers, it means being able to understand the complex interactions within tissues at a much faster rate. For clinicians, it could lead to better diagnostic tools and more personalized treatments, as knowing the cellular makeup of tissues can inform the understanding of various diseases and conditions.

SpatialPrompt represents a significant advancement in the field of spatial transcriptomics. By combining cutting-edge computational techniques with a vast, integrated database, it offers unprecedented speed and accuracy in mapping cell types within tissue samples. This tool not only addresses the limitations of previous methods but also opens new avenues for research and clinical applications. As we continue to explore the complexities of cellular environments, tools like SpatialPrompt will be indispensable in our quest for deeper understanding and better healthcare solutions.

Availability – The SpatialPrompt Python package and the scripts used for the benchmarking analysis are available on GitHub (https://github.com/swainasish/SpatialPrompt) and at Zenodo (https://doi.org/10.5281/zenodo.11070217).

Swain AK, Pandit V, Sharma J, Yadav P. (2024) SpatialPrompt: spatially aware scalable and accurate tool for spot deconvolution and domain identification in spatial transcriptomics. Commun Biol 7(1):639.
