Coexpression enhances cross-species integration of single-cell RNA sequencing across diverse plant species

Single-cell RNA sequencing (scRNA-seq) measures gene expression in individual cells. Comparing these measurements across species shows how gene activity and cell types differ between them, giving researchers a detailed view of the complexity and diversity of life, especially in plants.

Challenges in Plant Genomics

However, studying plants with scRNA-seq presents unique challenges. Plants often undergo whole-genome duplications, meaning they can have multiple copies of many genes. This makes it hard to identify one-to-one orthologues, that is, genes in different species that evolved from a single gene in their common ancestor and often retain the same function. Instead, researchers often find many-to-many orthology relationships, where several genes in one species correspond to several genes in another species.
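
To make the distinction concrete, here is a minimal Python sketch that labels orthologue pairs by relationship type. The gene names and the pair list are hypothetical, and the rule used here (counting how many partners each gene has) is an illustrative assumption, not the procedure used in the study.

```python
from collections import defaultdict


def classify_orthology(pairs):
    """Label each (gene_a, gene_b) orthologue pair by relationship type."""
    a_to_b = defaultdict(set)  # species A gene -> its partners in species B
    b_to_a = defaultdict(set)  # species B gene -> its partners in species A
    for gene_a, gene_b in pairs:
        a_to_b[gene_a].add(gene_b)
        b_to_a[gene_b].add(gene_a)

    labels = {}
    for gene_a, gene_b in pairs:
        a_has_many = len(a_to_b[gene_a]) > 1  # gene_a maps to several B genes
        b_has_many = len(b_to_a[gene_b]) > 1  # gene_b maps to several A genes
        if a_has_many and b_has_many:
            labels[(gene_a, gene_b)] = "many-to-many"
        elif a_has_many or b_has_many:
            labels[(gene_a, gene_b)] = "one-to-many"
        else:
            labels[(gene_a, gene_b)] = "one-to-one"
    return labels


# Hypothetical orthologue pairs between species A and species B.
example_pairs = [
    ("A1", "B1"),                # single-copy gene in both species
    ("A2", "B2"), ("A2", "B3"),  # duplicated in species B only
    ("A3", "B4"), ("A3", "B5"),  # duplicated in both species
    ("A4", "B4"), ("A4", "B5"),
]
orthology_labels = classify_orthology(example_pairs)
```

Only the pairs labelled "one-to-one" can be used directly as shared genes; the rest need an extra criterion to choose between the copies.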

The Role of Coexpression in Gene Identification

To overcome this challenge, scientists at Cold Spring Harbor Laboratory have developed a method that uses coexpression to narrow many-to-many orthology families down to one-to-one gene pairs. Coexpressed genes are genes that are expressed at the same time and under the same conditions. By looking at coexpression, researchers can find the genes with the most similar expression profiles across different species, even if those genes belong to large families with many duplicates.
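
The selection idea can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: it assumes each gene is summarized by its mean expression in the same ordered list of cell types, standardizes each profile, and keeps the candidate with the smallest Euclidean distance to the target. The gene names and expression values are invented for the example.

```python
import math
from statistics import mean, pstdev


def standardize(profile):
    """Scale a profile to zero mean and unit variance so shape, not level, is compared."""
    centre = mean(profile)
    spread = pstdev(profile)
    if spread == 0:
        return [0.0 for _ in profile]
    return [(value - centre) / spread for value in profile]


def pick_proxy(target_profile, candidate_profiles):
    """Return the candidate closest to the target, plus all distances."""
    target = standardize(target_profile)
    distances = {
        name: math.dist(target, standardize(profile))
        for name, profile in candidate_profiles.items()
    }
    best = min(distances, key=distances.get)
    return best, distances


# Invented mean expression values across four cell types (same order for every gene).
target_gene = [1, 5, 20, 3]
candidates = {
    "candidate_similar": [2, 9, 41, 5],     # same shape, higher overall level
    "candidate_different": [30, 4, 2, 25],  # peaks in other cell types
}
best_proxy, proxy_distances = pick_proxy(target_gene, candidates)
```

Standardizing first means a candidate is not penalized for being expressed at a higher or lower overall level than the target; whether that is appropriate depends on the data, and it is a choice made for this sketch only.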

Improving Integration Methods

Once the one-to-one gene pairs are identified through coexpression, traditional integration methods can be used more effectively, because the paired genes act as a set of features shared between the datasets. Integration methods combine data from different species, allowing scientists to compare gene expression patterns and cell types. Refining the identification of orthologues makes the integration more accurate and meaningful, which makes it easier to study the diversity of plant species.
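
As a rough illustration of how gene pairs feed into integration, the Python sketch below renames paired genes to a common identifier so that two datasets share the same features. The data layout (a dictionary from gene name to per-cell counts), the gene names and the `|` naming scheme are all assumptions made for this example; the integration step itself is left to whichever tool is used downstream.

```python
def to_shared_gene_space(expr_a, expr_b, gene_pairs):
    """Keep only paired genes and key both datasets by a shared pair identifier.

    expr_a, expr_b: dicts mapping gene name -> list of counts (one per cell).
    gene_pairs: list of (gene_in_a, gene_in_b) one-to-one pairs.
    """
    shared_a, shared_b = {}, {}
    for gene_a, gene_b in gene_pairs:
        # Skip pairs where either gene was not measured in its dataset.
        if gene_a in expr_a and gene_b in expr_b:
            shared_id = f"{gene_a}|{gene_b}"
            shared_a[shared_id] = expr_a[gene_a]
            shared_b[shared_id] = expr_b[gene_b]
    return shared_a, shared_b


# Hypothetical datasets: species A has three cells, species B has two.
species_a = {"geneA1": [0, 3, 1], "geneA2": [5, 0, 2]}
species_b = {"geneB7": [2, 2], "geneB9": [0, 4]}
pairs = [("geneA1", "geneB7"), ("geneA2", "geneB9"), ("geneA3", "geneB4")]

shared_a, shared_b = to_shared_gene_space(species_a, species_b, pairs)
```

After this step both datasets have identical feature names, which is the precondition most integration tools expect.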

Coexpression proxies integrate a split dataset without shared genes

Fig. 1

a, Schematic depicting the identification of coexpression proxies from gene orthology information and their use in expanding the gene space to enable integration followed by identification of novel and conserved cell types. b, Gene expression profile for target gene (AT1G16160) and two potential coexpression proxies (AT1G16150, AT4G31100). The gene with the more similar profile, AT1G16150, was identified as a coexpression proxy, while AT4G31100 was rejected. The centre band is the mean counts per million (CPM) for each gene in the cell type in our single-cell dataset. The error bar is the 95% confidence interval. QC, quiescent centre. c, UMAP showing integration of a split and dissociated A. thaliana dataset containing 16,636 cells using coexpression proxies. d, UMAP showing integration of the same dataset using the worst potential coexpression proxy from each gene family. e, UMAP showing the failed integration of the split and dissociated dataset using 1,900 random gene pairs. f, Euclidean distance from the expression profile of the target gene for n = 117 pairs of accepted coexpression proxies and rejected coexpression proxies in independent cell types, split by expression quartile of the target gene. The bottom of the box is the lower quartile, the top of the box is the upper quartile and the centre bar is the median. The whiskers are 1.5 times the interquartile range. g, Heat map showing the number of identified coexpression proxies between each species pair in the database.

Benefits for Plant Research

This approach has significant implications for plant research. By improving the ability to integrate scRNA-seq data across different plant species, scientists can better understand how plants adapt to their environments, how different species evolved, and how to develop crops with desirable traits. This method reduces the barriers to cross-species integration, opening up new possibilities for comparative plant genomics.

Conclusion

Single-cell RNA sequencing is a valuable tool for studying gene expression and cell types in plants. The frequent expansion of plant gene families due to whole-genome duplications makes identifying one-to-one orthologues challenging. However, by using coexpression to find one-to-one gene pairs, researchers can improve the performance of traditional integration methods, making it easier to study the diversity of plant species. This advancement holds great promise for enhancing our understanding of plant biology and improving agricultural practices.

Passalacqua MJ, Gillis J. (2024) Coexpression enhances cross-species integration of scRNA-seq across diverse plant species. Nat Plants [Epub ahead of print].
