The expanding field of epitranscriptomics might rival epigenomics in the diversity of biological processes it affects. In recent years, the development of new high-throughput experimental and computational techniques has been a key driving force in discovering the properties of RNA modifications. Machine learning applications, such as classification, clustering and de novo identification, have been critical to these advances. Nonetheless, various challenges remain before the full potential of machine learning for epitranscriptomics can be realized.
In this review, researchers from the Australian National University provide a comprehensive survey of machine learning methods that detect RNA modifications from diverse input data sources. They describe strategies for training and testing machine learning methods and for encoding and interpreting features that are relevant to epitranscriptomics. Finally, the researchers identify some of the current challenges and open questions in RNA modification analysis, including the ambiguity in assigning predicted RNA modifications to transcript isoforms or to single nucleotides, and the lack of complete ground truth sets against which to test predicted RNA modifications. They believe this review will inspire and benefit the rapidly developing field of epitranscriptomics in addressing the current limitations through the effective use of machine learning.
Transcriptome-wide prediction of chemical messenger RNA modifications with machine learning (ML)
The identification of chemical modifications in messenger RNA (mRNA) involves (A) reading sequence information alone or in combination with experimental data, (B) training and testing of ML methods and (C) analysis of the predicted outputs in terms of properties such as the localization of the RNA modifications, association with specific mRNA isoforms, stoichiometry and functional characterization. XGBoost, eXtreme Gradient Boosting; LSTM, long short-term memory.
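As an illustration of step (A), the following is a minimal sketch of one common way to turn sequence information into ML features: one-hot encoding a short window of nucleotides around a candidate site. It is not taken from the review; the function name `one_hot_window`, the `flank` parameter and the example sequence are hypothetical choices made for illustration only.

```python
from typing import List

# Assumed alphabet for illustration; DNA-style "T" is mapped to "U" below.
ALPHABET = "ACGU"


def one_hot_window(seq: str, center: int, flank: int = 2) -> List[int]:
    """One-hot encode the window seq[center - flank : center + flank + 1].

    Each position contributes four values (A, C, G, U). Positions that
    fall outside the sequence, and bases not in ALPHABET (e.g. N),
    are encoded as four zeros.
    """
    seq = seq.upper().replace("T", "U")
    features: List[int] = []
    for pos in range(center - flank, center + flank + 1):
        base = seq[pos] if 0 <= pos < len(seq) else None
        features.extend(1 if base == letter else 0 for letter in ALPHABET)
    return features


# Hypothetical candidate site: the "A" at index 2 of a toy sequence,
# with one flanking nucleotide on each side (window "GAC").
example = one_hot_window("GGACU", center=2, flank=1)
```

Feature vectors of this kind could then be passed, alone or together with experimental signals, to the training and testing stage described in step (B).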