For most organisms, DNA sequences are available, but the complete RNA sequences are not. Here, we call for technologies to sequence full-length RNAs with all their modifications.
RNA determines cell identity and mediates responses to cellular needs. Such diverse cellular functions arise from the vast chemical diversity of RNA, which comprises four canonical ribonucleotides (A, C, G and U) and more than 140 modified ribonucleotides. Many years of RNA research laid the foundation for the development of RNA therapeutics as diverse as antisense oligonucleotide therapy for spinal muscular atrophy and mRNA vaccines. These remarkable accomplishments were enabled by modified ribonucleotides, yet the ‘true’ sequence of RNA, i.e., the ‘RNome’, remains unknown. This key knowledge gap in understanding the building blocks of RNA must be filled. Here, we call for the development of high-throughput methods to sequence RNA directly on a transcriptome-wide scale and the necessary informatics to identify all RNA variants at the single-molecule level.
Chemical modifications of RNA
Of the more than 140 different modifications that occur across all types of RNA, only approximately ten can be mapped to specific sequence contexts using the methods discussed in this Comment. Methods are needed that can detect and quantify all modifications to obtain complete RNA sequences. Modification nomenclature is as described in Modomics, http://genesilico.pl/modomics/.