A new statistical model from Xi’an Jiaotong-Liverpool University aims to solve the problem of isoform ambiguity related to RNA modifications.
The paper, published on 2 November in the leading journal Bioinformatics, reveals how the MetaTX model enables isoform-specific allocation of RNA modifications.
Isoform ambiguity has been a problem in the field of RNA modification sequencing technologies for a while, explains Dr Jia Meng, of the Department of Biological Sciences and the corresponding author of the paper.
“We have been aware of this problem since 2013, and several years ago, there were a number of research grants awarded to resolve it. In the end, no one published anything related to it. It’s a recurring problem, and no existing model seems to have solved it.”
RNA, like DNA, is a type of nucleic acid and a basic component of all life, so greater scientific understanding of how it works is important.
“For a long time, DNA took first place in biological research on nucleic acids. However, the discoveries of catalytic RNAs and functional noncoding RNAs over the past two or three decades have completely changed our views on the topic. RNA research has become one of the most dynamic and fast-growing fields in science,” says Dr Meng.
He explains that modifications help determine what an RNA molecule does.
“RNA modifications work like an identity badge – it can get you into places that others can’t go. With a certain modification, maybe the RNA can be [translated] into protein, or without it, it will be dragged to the garbage bin.”
For this reason, when researching RNA, it’s important to know where on the RNA a given modification is located, because its location indicates its function. However, current RNA sequencing technology can map modifications only to DNA-based (genomic) coordinates.
Because multiple different forms of RNA molecules (or isoform transcripts) can be produced from the same DNA template, a single DNA coordinate corresponds to multiple locations on multiple isoform transcripts. This is known as isoform ambiguity.
“RNA modifications are located on RNA, but you only get their projected coordinates on their DNA template, which means lots of information is lost. The purpose of this study is to counteract that information loss, and recover isoform-level coordinates of RNA modification,” says Dr Meng.
Isoform ambiguity and compositional diversity of mRNAs
Although physically located on RNA, many mRNA-related features are recorded only by genome-based coordinates; the transcript to which they belong remains unclear due to technical limitations. In the example shown, the RNA modification site is denoted by a genome-based coordinate and overlaps with four isoform transcripts of the same gene. It may be associated with the 3’ UTR of isoform 1, or lie near the stop codon in the CDS of isoform 2, and so on, which may cause problems when characterizing the distribution of this mRNA-related feature. Note that isoform 1 has a longer 3’ UTR, isoform 2 has no 3’ UTR, and isoform 3 has a longer 5’ UTR, while isoform 4 has no 5’ UTR at all. These compositional differences may make it difficult to compare multiple mRNAs of the same or different genes.
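To make the ambiguity concrete, the short Python sketch below projects one genomic coordinate onto several isoforms of a gene. It is an illustration only, not MetaTX code: the isoform names, exon intervals and site position are hypothetical, and the gene is assumed to lie on the plus strand with 1-based, inclusive exon coordinates.

```python
# Hypothetical example: one genomic site, several isoforms of the same gene.
# Exons are (start, end) genomic intervals, 1-based inclusive, plus strand,
# listed in 5' to 3' order. None of these values come from the paper.

def genome_to_transcript(pos, exons):
    """Return the 1-based position of genomic `pos` on the spliced
    transcript, or None if `pos` does not fall inside any exon."""
    offset = 0  # transcript bases contributed by the exons already passed
    for start, end in exons:
        if start <= pos <= end:
            return offset + (pos - start) + 1
        offset += end - start + 1
    return None

isoforms = {
    "iso_A": [(100, 199), (300, 399), (500, 699)],
    "iso_B": [(100, 199), (500, 599)],   # skips the middle exon
    "iso_C": [(150, 199), (300, 399)],   # ends before the site
}

site = 520  # genome-based coordinate of the (hypothetical) modification

candidates = {}
for name, exons in isoforms.items():
    tx_pos = genome_to_transcript(site, exons)
    if tx_pos is not None:
        candidates[name] = tx_pos

print(candidates)
```

Here the single genomic position 520 lands at position 221 on iso_A and position 121 on iso_B, and is absent from iso_C. The sequencing data alone cannot say which of the two candidates actually carries the modification.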
Comparing apples to apples
There were two important aspects when developing this solution, says Dr Meng: “Firstly, we needed to develop a framework to make different transcripts comparable, because some may be wider or narrower, and some may not even have some typical features.
“Secondly, and this is what makes the MetaTX model unique, is that we assumed a non-uniform distribution of RNA features, or a non-uniform distribution of the features on the entire RNA.” He explains that all existing models for isoform ambiguity in RNA sequencing assume a uniform distribution.
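The two ideas Dr Meng describes can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the published MetaTX procedure: the region lengths, the six-bin density table and the function names are all invented for the example, and the article does not describe how MetaTX estimates its distribution. The first function rescales each transcript so that the 5’ UTR, CDS and 3’ UTR each span one unit, which makes transcripts of different lengths comparable. The second splits an ambiguous site among its candidate isoforms in proportion to a density over that common scale.

```python
# Toy sketch, not the MetaTX implementation. All numbers are hypothetical.

def normalized_coordinate(tx_pos, utr5_len, cds_len, utr3_len):
    """Map a 1-based transcript position to a common scale:
    5' UTR -> [0, 1), CDS -> [1, 2), 3' UTR -> [2, 3).
    A region of length 0 (a missing UTR) is simply skipped."""
    if tx_pos < 1:
        raise ValueError("transcript positions are 1-based")
    offset = 0
    for region_index, length in enumerate((utr5_len, cds_len, utr3_len)):
        if length > 0 and tx_pos <= offset + length:
            # Use the midpoint of the base so the fraction stays inside (0, 1).
            return region_index + (tx_pos - offset - 0.5) / length
        offset += length
    raise ValueError("position lies beyond the end of the transcript")

BINS_PER_REGION = 2

# Assumed density over the common scale (two bins per region, six in total).
# It is made up here to favour the region around the stop codon.
density = [0.05, 0.05, 0.10, 0.30, 0.40, 0.10]

def allocate(coords, density, bins_per_region=BINS_PER_REGION):
    """Share one ambiguous site among candidate isoforms in proportion to
    the density at each candidate's normalized coordinate."""
    weights = {
        name: density[int(coord * bins_per_region)]
        for name, coord in coords.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

# Candidate placements of one site:
# (transcript position, 5' UTR length, CDS length, 3' UTR length).
candidates = {
    "iso_A": (221, 50, 150, 200),
    "iso_B": (121, 50, 100, 50),
}

coords = {name: normalized_coordinate(*v) for name, v in candidates.items()}
probs = allocate(coords, density)
uniform = allocate(coords, [1 / 6] * 6)

print(coords)
print(probs)
print(uniform)
```

With a uniform density the two candidates are indistinguishable and each receives half of the site. With the assumed non-uniform density, the candidate that places the site early in the 3’ UTR receives a larger share than the one that places it late in the CDS. That difference is what dropping the uniform assumption buys.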
Dr Meng says that the new MetaTX model, being open source and freely available, can be adapted for other aspects of RNA research.
“We used it specifically for RNA modification research, but it can be applied to many different fields and different problems. There are so many possibilities, because this is a very general statistical framework for resolving ambiguity caused by heterogeneity.”
Availability – https://github.com/yue-wang-biomath/MetaTX.1.0
Source – Xi’an Jiaotong-Liverpool University