Recent developments in spatial transcriptomic technologies have made it possible to systematically characterize cellular heterogeneity while preserving spatial information, which greatly facilitates the investigation of a tissue's structural organization and its role in modulating cellular behavior. However, these technologies often lack the resolution needed to distinguish neighboring cells that may belong to different cell types, so it is difficult to identify the cell-type distribution directly from the data.
To overcome this challenge, Harvard Medical School researchers have developed a computational method, called spatialDWLS, to quantitatively estimate the cell-type composition at each spatial location. The researchers benchmarked spatialDWLS against a number of existing deconvolution methods using both real and simulated datasets, and found that it outperformed the other methods in both accuracy and speed. Applying spatialDWLS to a human developmental heart dataset, they observed striking spatial-temporal changes in cell-type composition, which becomes increasingly spatially coherent during development. As such, spatialDWLS provides a valuable computational tool for faithfully extracting biological information from spatial transcriptomic data.
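To make the idea of estimating cell-type composition at a spot concrete, the sketch below fits a spot's expression as a non-negative mixture of cell-type signatures and rescales the weights into proportions. It is not the spatialDWLS algorithm: it uses plain non-negative least squares solved by projected gradient descent, without the dampened weighting that DWLS applies. The function names, the toy signature matrix, and the toy spot are all hypothetical.

```python
def nnls_projected_gradient(signatures, spot, n_iter=2000):
    """Minimise 0.5 * ||S w - y||^2 subject to w >= 0 by projected gradient descent.

    signatures: list of rows, one per gene, each holding one value per cell type.
    spot: list of expression values for one spot, one per gene.
    """
    n_genes, n_types = len(signatures), len(signatures[0])
    # Step size 1 / ||S||_F^2 is safe: the squared Frobenius norm is an
    # upper bound on the largest eigenvalue of S^T S.
    step = 1.0 / sum(v * v for row in signatures for v in row)
    w = [0.0] * n_types
    for _ in range(n_iter):
        residual = [
            sum(signatures[g][k] * w[k] for k in range(n_types)) - spot[g]
            for g in range(n_genes)
        ]
        gradient = [
            sum(signatures[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step, then projection onto the non-negative orthant.
        w = [max(0.0, w[k] - step * gradient[k]) for k in range(n_types)]
    return w


def estimate_proportions(signatures, spot):
    """Rescale the non-negative weights so that they sum to one."""
    w = nnls_projected_gradient(signatures, spot)
    total = sum(w)
    return [x / total for x in w] if total > 0 else w


# Hypothetical toy data: 4 genes (rows) by 3 cell types (columns).
signatures = [
    [10.0, 1.0, 0.0],
    [1.0, 8.0, 1.0],
    [0.0, 1.0, 9.0],
    [2.0, 2.0, 2.0],
]
# This spot was constructed as 3 parts of cell type 0 plus 1 part of cell type 2.
spot = [30.0, 4.0, 9.0, 8.0]
proportions = estimate_proportions(signatures, spot)
print([round(p, 3) for p in proportions])
```

Because the toy spot is an exact mixture of the signatures, the recovered proportions should be close to the mixing fractions used to build it. Real data are noisy and genes differ widely in expression level, which is the kind of situation a weighted scheme is meant to handle.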
An overview of the spatialDWLS method
a. A schematic representation of the spatialDWLS workflow. The input consists of a spatial transcriptomic dataset (gene expression matrix and cell location coordinates) and a set of known cell-type-specific gene signatures. For each spot, the cell types that are likely to be present are identified by cell-type enrichment analysis. Then, a modified DWLS method is applied to infer the cell-type composition at each spot. b. Comparison of the accuracy of different deconvolution methods. Single-cell resolution seqFISH+ data are coarse-grain averaged to generate lower-resolution spatial transcriptomic data. The true frequency of a cell type at each spot (indicated as blue squares in the top left panel) is compared with the frequency inferred by each method (indicated as red squares in the five other panels). The relationship is also shown as a scatter plot, with the x-axis representing the true frequency and the y-axis representing the inferred frequency. The overall performance is quantified as the root mean square error (RMSE). The oligodendrocyte cell type is used here as a representative example. c. The overall RMSE is further decomposed into two components, corresponding to regions where the cell type is absent (red) and present (green), respectively. d. Comparison of the computing speed of different methods. Running times for analyzing a mouse brain Visium dataset are shown.
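The benchmark described in panels b and c can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: cells are binned into square spots to obtain true cell-type frequencies, and the RMSE is computed overall and separately for spots where the cell type is absent or present. The function names, the square-grid binning, the toy coordinates, and the inferred values are hypothetical, and computing one RMSE per subset is only one plausible reading of the decomposition in panel c.

```python
import math
from collections import Counter, defaultdict


def coarse_grain(cells, spot_size):
    """Bin single cells into square spots and return true cell-type frequencies.

    cells: iterable of (x, y, cell_type) tuples.
    Returns {spot_index: {cell_type: fraction of the cells in that spot}}.
    """
    counts = defaultdict(Counter)
    for x, y, cell_type in cells:
        spot = (int(x // spot_size), int(y // spot_size))
        counts[spot][cell_type] += 1
    return {
        spot: {t: n / sum(c.values()) for t, n in c.items()}
        for spot, c in counts.items()
    }


def rmse(pairs):
    """Root mean square error over (true, inferred) pairs; NaN if there are none."""
    pairs = list(pairs)
    if not pairs:
        return float("nan")
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


def rmse_by_presence(true_freq, inferred_freq):
    """Overall RMSE, plus RMSE restricted to spots where the type is absent or present."""
    pairs = list(zip(true_freq, inferred_freq))
    return {
        "overall": rmse(pairs),
        "absent": rmse((t, p) for t, p in pairs if t == 0),
        "present": rmse((t, p) for t, p in pairs if t > 0),
    }


# Hypothetical single-cell data: (x, y, cell type).
cells = [
    (0.5, 0.5, "oligodendrocyte"),
    (1.5, 0.2, "neuron"),
    (2.5, 0.5, "neuron"),
    (3.1, 1.0, "neuron"),
]
truth = coarse_grain(cells, spot_size=2.0)
spots = sorted(truth)
true_oligo = [truth[s].get("oligodendrocyte", 0.0) for s in spots]

# Hypothetical frequencies that some deconvolution method might return.
inferred_oligo = [0.4, 0.2]
scores = rmse_by_presence(true_oligo, inferred_oligo)
print(scores)
```

Separating the two subsets shows whether a method's error comes mainly from false signal in spots where the cell type is absent or from misestimating its abundance where it is present.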
Availability – All code, data, and analysis results from this paper are publicly available on GitHub: https://github.com/rdong08/spatialDWLS_dataset