Cancer is a generic term for a large group of malignant diseases that can affect any part of the body. It is characterized by abnormal cells that grow beyond their usual boundaries, invade adjoining tissue, and spread to distant organs (metastasis). Analyzing and characterizing the genomic structural, sequence, and expression variations that cause or are associated with cancer is essential for advancing our knowledge of the disease and developing new treatments. Rapid development in high-throughput sequencing technologies is enabling the characterization of genome-wide alterations in cancer at single-base resolution, and it is now feasible to sequence entire cancer genomes from a large number of samples in a timely and cost-efficient manner. Characterizing these genomic changes is key to discovering novel therapeutic targets, and doing so comprehensively and accurately, in particular distinguishing true driver mutations from background passenger mutations, requires a large number of samples from cancer patients.
A major challenge in these studies is obtaining large numbers of fresh tissue samples that also have long-term clinical follow-up information on disease progression and outcome.
Researchers at the University of Kansas Medical Center now show that Illumina RNA sequencing of formalin-fixed, paraffin-embedded (FFPE) diagnostic tumor samples produces gene expression measurements that are strongly correlated with those from matched frozen tumor samples (r > 0.89). In addition, sequence variations identified from FFPE RNA show 99.67% concordance with those identified by exome sequencing of matched frozen tumor samples. Because FFPE is a routine diagnostic sample preparation, the feasibility results reported here will facilitate the setup of large-scale research and clinical studies in medical genomics that are currently limited by the availability of fresh frozen samples.
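The two summary statistics quoted above, a correlation coefficient for expression and a percent concordance for variant calls, can be illustrated with a minimal sketch. It assumes Pearson correlation and a simple "fraction of shared sites with identical calls" definition of concordance; the study's exact definitions may differ. The function names and all values below are made up for illustration and are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)

def concordance(calls_a, calls_b):
    """Fraction of sites present in both call sets where the calls agree.

    Assumed definition for illustration; sites called in only one
    sample are ignored.
    """
    shared = set(calls_a) & set(calls_b)
    if not shared:
        raise ValueError("no shared sites")
    agree = sum(1 for site in shared if calls_a[site] == calls_b[site])
    return agree / len(shared)

# Toy expression values for the same five genes in a frozen (FF)
# and an FFPE sample (hypothetical numbers).
ff_expr = [1.0, 2.0, 3.0, 4.0, 5.0]
ffpe_expr = [1.1, 1.9, 3.2, 3.8, 5.1]
r = pearson(ff_expr, ffpe_expr)

# Toy variant calls keyed by (chromosome, position) (hypothetical sites).
ffpe_rna_calls = {("chr1", 100): "A/G", ("chr2", 200): "C/C", ("chr3", 300): "T/T"}
ff_exome_calls = {("chr1", 100): "A/G", ("chr2", 200): "C/T",
                  ("chr3", 300): "T/T", ("chr4", 400): "G/G"}
c = concordance(ffpe_rna_calls, ff_exome_calls)

print(f"expression correlation r = {r:.3f}")
print(f"variant concordance = {c:.2%}")
```

In practice expression values are usually log-transformed before correlation, and concordance is computed only over sites with adequate coverage in both assays; both steps are omitted here to keep the sketch short.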
Heat map of the clustered correlation matrix (RNA-seq and NanoString data sets from paired FF and FFPE samples) showing strong correlation in gene expression between paired fresh frozen (FF) and FFPE samples. The color key was scaled to the minimal and maximal values to make differences easier to see. The dendrogram illustrates the relationship distance between samples. Associated samples (e.g. FF2474 and FFPE2474) from data sets produced with the same technology (either RNA sequencing or NanoString) are highlighted with a white frame, whereas associated samples from different technologies are highlighted with a blue frame. Sample NS_FF3356r1.NS_FF4079r1 was a replicate that was mislabeled as FF3356r1 during NanoString analysis; clustering analysis nevertheless correctly identified it as FF4079r1. Note that the correctly labeled FF3356r1 clustered with the other FF3356 and FFPE3356 replicates.
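The mislabeling case in the caption suggests how correlation can serve as a sample-identity check. The sketch below is a simplified stand-in, not the hierarchical clustering used for the figure: for each sample it finds the patient label whose other samples correlate best with it on average, and flags the sample if that label differs from its own. Sample names, patient IDs, and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def best_matching_patient(sample, profiles, labels):
    """Patient label whose other samples correlate best, on average, with `sample`."""
    by_patient = {}
    for other, values in profiles.items():
        if other == sample:
            continue  # never compare a sample with itself
        r = pearson(profiles[sample], values)
        by_patient.setdefault(labels[other], []).append(r)
    return max(by_patient, key=lambda p: sum(by_patient[p]) / len(by_patient[p]))

def flag_possible_mislabels(profiles, labels):
    """Samples whose best-matching patient differs from their assigned label."""
    return sorted(
        s for s in profiles
        if best_matching_patient(s, profiles, labels) != labels[s]
    )

# Hypothetical expression profiles over the same five genes.
profiles = {
    "FF_P1":     [1.0, 5.0, 2.0, 8.0, 3.0],
    "FFPE_P1":   [1.2, 4.7, 2.1, 7.6, 3.3],
    "FF_P2":     [7.0, 1.0, 6.0, 2.0, 9.0],
    "FFPE_P2":   [6.8, 1.3, 5.9, 2.2, 8.7],
    "FF_P1_rep": [7.1, 0.9, 6.2, 1.8, 9.1],  # labeled P1 but resembles P2
}
labels = {
    "FF_P1": "P1", "FFPE_P1": "P1", "FF_P1_rep": "P1",
    "FF_P2": "P2", "FFPE_P2": "P2",
}

suspects = flag_possible_mislabels(profiles, labels)
print(suspects)
```

Averaging over all samples that share a label, rather than taking the single nearest neighbor, keeps one mislabeled replicate from causing its true siblings to be flagged as well.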