While the large proportion of our genome that does not instruct our cells to form proteins has been harder to study than protein-coding genes, it has been shown to have vital physiological functions. Scientists at Karolinska Institutet in Sweden have now developed new high-precision tools able to identify what these noncoding sequences do. The study, which is published in the journal Nature Genetics, may eventually contribute to the development of new, targeted drugs.
Only a small proportion of our genome comprises genes that instruct the cells to make specific proteins; the majority is so-called noncoding DNA, sometimes called “junk DNA” because its function has been largely unknown. However, recent research has shown that some of these sequences can give rise to RNA that affects vital cellular processes. It transpires that most of the genetic changes linked to different diseases lie in these very noncoding parts of patients’ DNA.
“This has come as a great surprise and we now need to understand in detail how these genetic changes affect different diseases in order to eventually be able to develop more accurate drugs,” says the study’s first author Per Johnsson, researcher at the Department of Cell and Molecular Biology, Karolinska Institutet. “Generally speaking we don’t know that much about this interaction, but we believe that noncoding RNA will one day be a source of attractive drug candidates. It’s therefore extremely important that we speed up the characterisation of these RNA molecules.”
For this study, the researchers combined single-cell sequencing with mathematical calculations and showed that this approach can identify the function of noncoding RNA, something that has previously proved very difficult. Using these tools, they identified an entirely new mechanism by which the RNA molecules regulate the activity of protein-coding genes in their vicinity.
Identification of cell cycle regulated lncRNAs using scRNA-seq
a, Boxplots showing the normalized expression levels of cell cycle marker genes in cells classified to the cell cycle phase (n = 533 cells, the center lines show the medians, the interquartile limits indicate the 25th and 75th percentiles and the whiskers denote the farthest points at a maximum of 1.5 times the IQR, colored according to the cell cycle phase). b, Scatterplots showing lncRNAs with significant expression differences across cell cycle phases (y axis, Benjamini–Hochberg-adjusted ANOVA) against the fold induction (x axis) compared to the other cell cycle phases. The top ranked candidates selected for further validation are colored red. c, Relative expression levels of candidate lncRNAs in lentiviral transduced NIH/3T3 cells measured by RT–qPCR. d, Quantification of colony-forming cells in shControl cells and cells with stable shRNA-induced knockdown of lncRNAs, together with representative photos of staining (whole-well images). e, Relative expression of cell cycle-associated lncRNAs on siRNA-induced knockdown, measured by RT–qPCR. f, Quantification of colony-forming cells on siRNA-induced knockdown for candidate lncRNAs. c–f, n = 3–4 biologically independent samples, data presented as mean values ± s.e.m. and the P values represent a two-sided Student’s t-test.
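For readers unfamiliar with the statistics named in panel b, the following is a minimal Python sketch of a Benjamini–Hochberg adjustment and a simple fold-induction calculation. It is an illustration only, not the authors' code (their analysis was done in R). The function names, the choice of averaging over all cells in the other phases, and the example numbers are assumptions made here for demonstration.

```python
from statistics import mean


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def fold_induction(expression_by_phase, phase):
    """Mean expression in `phase` divided by the mean over cells in all other phases.

    Assumption: "fold induction" is taken here as a ratio of means over cells;
    the published analysis may define it differently.
    """
    in_phase = mean(expression_by_phase[phase])
    others = [
        value
        for name, values in expression_by_phase.items()
        if name != phase
        for value in values
    ]
    return in_phase / mean(others)


# Hypothetical per-lncRNA ANOVA p-values (made up for illustration).
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)

# Hypothetical normalized expression of one lncRNA in cells from each phase.
expression = {"G1": [1.0, 1.0], "S": [2.0, 2.0], "G2M": [6.0, 6.0]}
g2m_fold = fold_induction(expression, "G2M")
```

In a plot like panel b, each lncRNA would contribute one point, with its fold induction on the x axis and its adjusted P value on the y axis; candidates combining a strong induction with a small adjusted P value would stand out.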
“After many years of development, single-cell sequencing has now reached a stage where we can isolate individual cells and study regulating mechanisms with high precision,” says principal investigator Rickard Sandberg, professor at the Department of Cell and Molecular Biology, Karolinska Institutet. “This is multidisciplinary research that we believe will contribute significantly to our basic understanding of cell biology and that, in the long run, can give us new insights into how cellular function can be influenced through the agency of small drug substances.”
The group has so far used the method to study the function of a handful of noncoding RNA molecules, but there are thousands of similar molecules waiting to be characterised. They now plan to do similar work on RNA molecules with a possible role in the development of diseases such as cancer.
“We’ll be applying larger-scale methods to study hundreds to thousands of similar genes in parallel, thus greatly advancing our understanding of these interesting RNA molecules,” says Dr Johnsson.
Source – Karolinska Institutet
Availability – The R code used to reproduce and plot the major findings has been made available at https://github.com/sandberg-lab/lncRNAs_bursting