Tumor heterogeneity—the complex microenvironment of different cell subpopulations within a single tumor—remains a major challenge in cancer research.
In the past, traditional RNA sequencing methods were limited to bulk gene expression profiles averaged over thousands of cells. The development of single-cell RNA sequencing technology has helped cancer biologists better understand the specific mechanisms that lead to tumor heterogeneity and drug resistance.
However, these large, complex datasets are often difficult to navigate.
Morgridge Postdoctoral Fellow Matthew Bernstein developed a web tool to explore these public datasets and facilitate analysis for cancer researchers.
The application, called CHARacterizing Tumor Subpopulations, or CHARTS, was published in the journal BMC Bioinformatics earlier this year.
“Single cell genomics is kind of this massive wave of innovation in biotechnology and bioinformatics right now,” says Bernstein. “The pace of innovation is crazy, because there’s so many questions that people want to answer.”
Bernstein hopes that CHARTS will be a useful platform to help researchers easily cross-reference the public datasets in the NCBI’s Gene Expression Omnibus.
“I’m really interested in taking all this data and trying to combine it in new ways,” he says. “I want to apply new models to try and extract more knowledge.”
CHARTS serves as a quick hypothesis testing tool, answering questions such as: Is this gene expressed in these cell types for a specific type of cancer? Or, are these cell types present in malignant or benign tumors?
“I think there’s power in combining datasets together,” says Bernstein. “If you have just one dataset, that could be adequate to answer that specific scientist’s question. But there might be other questions that require multiple datasets to be aggregated together.”
The framework for CHARTS was born out of an idea Bernstein had while attending a bioinformatics hackathon. He continued to build the platform with the encouragement of his mentor, Morgridge Investigator Ron Stewart.
A schematic diagram of the CHARTS pipeline
Public scRNA-seq datasets are collected and analyzed with a custom pipeline. This pipeline computes clusters, malignancy scores, dimension reduction transformations, cell type annotations, gene set enrichment scores, and differentially expressed genes for each cluster. Results are stored in a backend database and are accessed from the frontend web application.
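To make two of those pipeline outputs concrete, here is a minimal Python sketch of per-cluster mean expression and a simple log fold-change ranking of genes for one cluster. It is not the CHARTS implementation: the expression values, the cluster labels, and the function names (`cluster_means`, `log2_fold_changes`, `top_gene`) are all invented for illustration, and a real pipeline would use a statistical test rather than a bare fold change.

```python
from collections import defaultdict
from math import log2
from statistics import mean

# Toy expression values: cell -> {gene: value}. Invented for illustration only.
CELLS = {
    "cell1": {"EPCAM": 9.0, "CD3E": 0.0, "PTPRC": 1.0},
    "cell2": {"EPCAM": 7.0, "CD3E": 1.0, "PTPRC": 0.0},
    "cell3": {"EPCAM": 0.0, "CD3E": 8.0, "PTPRC": 6.0},
    "cell4": {"EPCAM": 1.0, "CD3E": 8.0, "PTPRC": 8.0},
}

# Hypothetical cluster labels, standing in for the output of a clustering step.
CLUSTERS = {"cell1": "c0", "cell2": "c0", "cell3": "c1", "cell4": "c1"}


def cluster_means(cells, clusters):
    """Average each gene's expression over the cells in each cluster."""
    grouped = defaultdict(lambda: defaultdict(list))
    for cell, profile in cells.items():
        for gene, value in profile.items():
            grouped[clusters[cell]][gene].append(value)
    return {c: {g: mean(v) for g, v in genes.items()} for c, genes in grouped.items()}


def log2_fold_changes(cells, clusters, target, pseudocount=1.0):
    """Per gene, log2 ratio of mean expression inside `target` vs. all other cells.

    Assumes at least one cell lies outside the target cluster.
    """
    inside, outside = defaultdict(list), defaultdict(list)
    for cell, profile in cells.items():
        bucket = inside if clusters[cell] == target else outside
        for gene, value in profile.items():
            bucket[gene].append(value)
    return {
        g: log2((mean(inside[g]) + pseudocount) / (mean(outside[g]) + pseudocount))
        for g in inside
    }


def top_gene(cells, clusters, target):
    """Gene with the largest log2 fold change for the target cluster."""
    lfc = log2_fold_changes(cells, clusters, target)
    return max(lfc, key=lfc.get)
```

Calling `cluster_means(CELLS, CLUSTERS)` answers the kind of question the tool is meant for (is this gene expressed in this group of cells?), while `top_gene(CELLS, CLUSTERS, "c0")` picks the gene that most distinguishes one cluster from the rest.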
“You rely on these tools to understand human health and disease, and build things that affect people. So the tools have to be open-source,” Bernstein adds. “And if you want someone to actually use the thing you built, you have to make it really easy to use.”
Availability – CHARTS is freely available for researchers to use at charts.morgridge.org
Source – Morgridge Institute for Research