Long-read scRNA-seq to become increasingly relevant in oncology and personalized medicine

Understanding the complex background of cancer requires genotype-phenotype information at single-cell resolution. Researchers at ETH Zurich performed long-read single-cell RNA sequencing (scRNA-seq) on clinical samples from three ovarian cancer patients presenting with omental metastasis, increasing the PacBio sequencing depth to 12,000 reads per cell. This approach captured 152,000 isoforms, of which over 52,000 were not previously reported. Isoform-level analysis that accounts for non-coding isoforms revealed that protein-coding gene expression is overestimated by 20% on average. The researchers also detected cell type-specific isoform and polyadenylation site usage in tumor and mesothelial cells, and found that mesothelial cells transition into cancer-associated fibroblasts in the metastasis, partly through the TGF-β/miR-29/Collagen axis. Furthermore, they identified gene fusions, including an experimentally validated IGF2BP2::TESPA1 fusion that is misclassified as high TESPA1 expression in matched short-read data, and called mutations that were confirmed by targeted NGS cancer gene panel results. Based on these findings, the researchers expect long-read scRNA-seq to become increasingly relevant in oncology and personalized medicine.
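To make the overestimation point concrete, here is a minimal Python sketch of how gene-level counting can inflate apparent protein-coding expression when non-coding isoforms of the same gene are included. The gene names, isoform IDs, UMI counts, and the function `protein_coding_overestimation` are all hypothetical, and the metric used here (the share of a gene's UMIs that come from non-coding isoforms) is an assumption for illustration, not necessarily the definition used in the study.

```python
from collections import defaultdict

# Hypothetical isoform-level UMI counts: (gene, isoform_id, biotype, umis).
# None of these values come from the study; they only illustrate the idea.
isoform_counts = [
    ("GENE_A", "A-201", "protein_coding", 80),
    ("GENE_A", "A-202", "retained_intron", 20),
    ("GENE_B", "B-201", "protein_coding", 50),
    ("GENE_B", "B-202", "nonsense_mediated_decay", 10),
]


def protein_coding_overestimation(records):
    """Return, per gene, the fraction of UMIs from non-coding isoforms.

    Assumption: gene-level counting attributes every UMI to the gene,
    so this fraction is the part of the gene-level signal that does not
    reflect protein-coding transcripts.
    """
    total = defaultdict(int)   # all UMIs per gene (gene-level view)
    coding = defaultdict(int)  # UMIs from protein-coding isoforms only
    for gene, _isoform, biotype, umis in records:
        total[gene] += umis
        if biotype == "protein_coding":
            coding[gene] += umis
    return {
        gene: (total[gene] - coding[gene]) / total[gene]
        for gene in total
        if total[gene] > 0
    }


overestimation = protein_coding_overestimation(isoform_counts)
for gene, fraction in sorted(overestimation.items()):
    print(f"{gene}: {fraction:.1%} of gene-level UMIs are non-coding")
```

Short-read gene-level quantification typically cannot separate these isoforms, which is why an isoform-resolved view is needed to compute such a fraction at all.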

Study design and long-read data overview

Fig. 1

a Schematic of freshly processed high-grade serous ovarian cancer (HGSOC) omentum metastases and patient-matched tumor-free distal omentum tissue biopsies subjected to scRNA-seq. b Definition of SQANTI-defined isoform structural categories. c Proportions of isoform structural categories detected in merged metastasis and distal omentum samples. Percentage and total number of isoforms per category are indicated. d Proportions of unique reads attributed to isoforms detected in (c). Percentage and total number of UMIs per category are indicated. e Percentage of isoforms for which the transcription start site is supported by CAGE (FANTOM5) data and the transcription termination site is supported by polyadenylation (PolyASite) data, per isoform structural category. “GENCODE.all” indicates all protein-coding isoforms in the GENCODE database, “GENCODE.FL” is a subset of “GENCODE.all” containing only isoforms tagged as full-length, and “GENCODE.MANE” is a hand-curated subset of canonical transcripts, one per human protein-coding locus. f GENCODE-defined biotype composition of novel isoforms. g Biotype composition of the GENCODE database.

Dondi A, Lischetti U, Jacob F, Singer F, Borgsmüller N, Coelho R; Tumor Profiler Consortium; Heinzelmann-Schwarz V, Beisel C, Beerenwinkel N. (2023) Detection of isoforms and genomic alterations by high-throughput full-length single-cell RNA sequencing in ovarian cancer. Nat Commun 14(1):7780.
