DNA mutations are known cancer drivers. Researchers at Memorial Sloan Kettering Cancer Center investigated whether mRNA events that are upregulated in cancer can functionally mimic the outcome of genetic alterations. They applied RNA sequencing (RNA-seq) or 3′-end sequencing (3′-seq) to normal and malignant B cells from 59 patients with chronic lymphocytic leukaemia (CLL). In primary CLL cells they discovered widespread upregulation of truncated mRNAs and proteins that were generated not by genetic alterations but by intronic polyadenylation (IPA). Truncated mRNAs caused by IPA were recurrent (n = 330) and predominantly affected genes with tumour-suppressive functions. The truncated proteins generated by IPA often lack the tumour-suppressive functions of the corresponding full-length proteins (such as DICER and FOXN3), and several even acted in an oncogenic manner (such as CARD11, MGA and CHST11). In CLL, the inactivation of tumour-suppressor genes (TSGs) by aberrant mRNA processing is substantially more prevalent than the functional loss of such genes through genetic events. The researchers further identified new candidate TSGs that are inactivated by IPA in leukaemia and by truncating DNA mutations in solid tumours. These genes are understudied in cancer because their overall mutation rates are lower than those of well-known TSGs. The findings show the need to go beyond genomic analyses in cancer diagnostics: mRNA events that are silent at the DNA level are widespread contributors to cancer pathogenesis through the inactivation of TSGs.
Hundreds of genes generate recurrent CLL-IPAs
a, Schematic showing full-length mRNA and protein expression in normal cells and the generation of a truncated mRNA and protein through cancer-specific intronic polyadenylation (IPA), despite no difference in DNA sequence. Polyadenylation sites (pA) are shown in light green. Loss of essential protein domains (dark green boxes) through cancer-gained IPA may inactivate tumour-suppressor genes (TSGs), thus contributing to cancer pathogenesis. b, Representative CLL-IPAs (from n = 330) are shown. mRNA 3′ ends detected by 3′-seq are depicted as peaks, the heights of which correspond to transcript abundance in transcripts per million (TPM). The bottom panel shows RNA-seq reads; numbers correspond to read counts. Full-length and IPA-generated truncated proteins are depicted in grey, known domains are shown in green and the domains lost through IPA are named. For each CLL-IPA, the number of retained and novel amino acids (aa) and the number of amino acids of the full-length protein are given. CC, coiled-coil; MemB, memory B cells; NB, naive B cells. c, Representative RNA-seq tracks from two independent CLL datasets are shown as in b; one dataset is indicated by ‘L’ before the patient number (CLL-L14). B3 denotes donor 3. Zoomed-in view shows the exonized part of intron 23 of DICER1 (green). d, Difference in relative abundance (usage) of IPA isoforms between CLL and normal CD5+ B cells. A generalized linear model (GLM) was used to identify significant events. CLL-IPAs with significantly higher usage are shown in red (false discovery rate (FDR)-adjusted P < 0.1, usage difference ≥ 0.05, TPM in CD5+ B < 8) and CD5+ B-IPAs are shown in blue. Grey denotes IPAs present in CLL and CD5+ B cells without significantly different usage. e, The number of CLL-IPAs per sample is shown as box plots, in which the horizontal line denotes the median; boxes denote the 25th and 75th percentiles; error bars denote the range. CLL high, n = 21/59, median of CLL-IPAs/sample = 98 versus CLL low, n = 38/59, median = 29. ***P = 6 × 10⁻¹⁰, two-sided Mann–Whitney U-test.
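The thresholds given for panel d can be read as a simple filter over per-event statistics. The Python sketch below is illustrative only and is not the authors' pipeline: the `IpaEvent` record, its field names and the `classify` helper are hypothetical. It assumes the FDR-adjusted P value from the GLM is already computed, and that the CD5+ B-IPA (blue) class mirrors the FDR and usage-difference thresholds, which the legend does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class IpaEvent:
    """Hypothetical per-event summary; field names are assumptions."""
    gene: str
    usage_cll: float   # relative usage of the IPA isoform in CLL (0 to 1)
    usage_cd5b: float  # relative usage of the IPA isoform in normal CD5+ B cells
    tpm_cd5b: float    # abundance of the IPA isoform in CD5+ B cells (TPM)
    fdr: float         # FDR-adjusted P value, assumed to come from the GLM


def classify(event, fdr_max=0.1, min_diff=0.05, max_tpm_cd5b=8.0):
    """Label one IPA event using the thresholds quoted for panel d."""
    diff = event.usage_cll - event.usage_cd5b
    significant = event.fdr < fdr_max
    # Red points: higher usage in CLL and low expression in CD5+ B cells.
    if significant and diff >= min_diff and event.tpm_cd5b < max_tpm_cd5b:
        return "CLL-IPA"
    # Blue points: assumed to be the mirror case (higher usage in CD5+ B cells).
    if significant and -diff >= min_diff:
        return "CD5+ B-IPA"
    # Grey points: present in both cell types without a significant difference.
    return "unchanged"


# Made-up values for illustration; these are not data from the study.
examples = [
    IpaEvent("GENE_A", usage_cll=0.40, usage_cd5b=0.10, tpm_cd5b=2.0, fdr=0.01),
    IpaEvent("GENE_B", usage_cll=0.10, usage_cd5b=0.40, tpm_cd5b=2.0, fdr=0.01),
    IpaEvent("GENE_C", usage_cll=0.40, usage_cd5b=0.10, tpm_cd5b=2.0, fdr=0.50),
]
labels = [classify(e) for e in examples]
```

The expression cut-off in CD5+ B cells is applied only to the CLL-IPA class here, following the order in which the legend lists the criteria; an event that passes the significance and usage thresholds but exceeds that cut-off falls back to "unchanged" in this sketch.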