Breast cancer is a complex disease characterized by gene deregulation. There has been little systematic investigation of long intergenic non-coding RNAs (lincRNAs) as biomarkers associated with breast cancer pathogenesis or with clinicopathological variables such as receptor status and patient survival. Researchers at the Vanderbilt University Medical Center designed a two-stage study. The discovery stage used RNA-seq data from 1,000 breast tumors in The Cancer Genome Atlas (TCGA); the replication stage used RNA-seq data from matched tumor and adjacent normal tissue of 50 breast cancer patients, together with 23 normal breast tissue samples from healthy women. The researchers identified 83 lincRNAs showing significant expression changes in breast tumors at a false discovery rate (FDR) < 1% in the discovery dataset. Thirty-seven of the 83 were validated in the replication dataset. Integrative genomic analyses suggested that the aberrant expression of these 37 lincRNAs was probably related to altered expression of several transcription factors (TFs). The researchers also observed a differential co-expression pattern between lincRNAs and their neighboring genes, and found that the expression level of one lincRNA (RP5-1198O20, Ensembl ID ENSG00000230615) was associated with breast cancer survival (P < 0.05). This study identifies a set of aberrantly expressed lincRNAs in breast cancer.
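The two-stage filtering described above can be illustrated with a minimal Python sketch. It is not the researchers' pipeline: the summary does not say which FDR procedure or replication criterion was used, so the Benjamini-Hochberg adjustment, the nominal replication cutoff `alpha=0.05`, the function names, and the toy p-values below are all assumptions made for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def two_stage_hits(discovery_p, replication_p, fdr=0.01, alpha=0.05):
    """Keep lincRNAs passing the discovery FDR and a nominal replication cutoff.

    discovery_p / replication_p: dicts mapping lincRNA name -> p-value.
    The replication rule (p < alpha) is an assumption, not taken from the study.
    """
    names = list(discovery_p)
    qvalues = benjamini_hochberg([discovery_p[n] for n in names])
    discovered = [n for n, q in zip(names, qvalues) if q < fdr]
    # A lincRNA missing from the replication data is treated as not validated.
    validated = [n for n in discovered if replication_p.get(n, 1.0) < alpha]
    return discovered, validated


# Toy p-values for three hypothetical lincRNAs (not real data).
discovery = {"lincA": 0.0001, "lincB": 0.0005, "lincC": 0.2}
replication = {"lincA": 0.01, "lincB": 0.3}
discovered, validated = two_stage_hits(discovery, replication)
print(discovered, validated)
```

In this toy input, two candidates pass the discovery threshold and one of them also passes the assumed replication cutoff, mirroring in miniature how 83 discovery hits narrowed to 37 validated lincRNAs.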
Specific expression of lincRNAs in breast cancer subtypes
(A) Heatmap of three lincRNAs specifically over-expressed in ER+ breast cancer. Red and green represent 664 ER+ and 196 ER- cancer samples from TCGA, respectively. The black bar denotes 85 adjacent normal tissues. (B-D) Distribution of ERα DNA binding in three lincRNA genes: (B) GATA3-AS1 (Ensembl ID ENSG00000197308), (C) RP11-279F6 (Ensembl ID ENSG00000245750) and (D) AC017048 (Ensembl ID ENSG00000224577). The gray bars represent ERα DNA binding enrichment in MCF-7 cells. The track at the top of each lincRNA panel shows the chromatin states called by the chromHMM algorithm in the HMEC cell line. Chromatin state colors: bright red, active promoter; light red, weak promoter; orange, strong enhancer; yellow, weak/poised enhancer; blue, insulator; green, transcriptional region; gray, heterochromatin/low signal.