Fundamental aspects of SARS-CoV-2 biology remain to be described, and characterising them could inform the response effort for this high-priority pathogen.
University of Melbourne researchers describe the first native RNA sequence of SARS-CoV-2, detailing the coronaviral transcriptome and epitranscriptome, and share these data publicly. They also make data-driven inferences of viral genetic features and evolutionary rate. The rapid sharing of sequence information throughout the SARS-CoV-2 pandemic represents an inflection point for public health and genomic epidemiology, providing early insights into the biology and evolution of this emerging pathogen.
Breakpoint analysis of the SARS-CoV-2 transcriptome
Direct RNA reads carrying a breakpoint relative to the 5’ leader sequence are shown, representing potentially viable transcripts. These breakpoints are localised at the same position on the leader sequence (positions 62-68) and 3’ to predicted transcription regulating sequences in the body of the genome (TRS-Bs, highlighted by vertical white lines), generating common subgenomic mRNAs. Of note, many low-frequency breakpoints are detected, although few lie near the sequence currently annotated as ORF10. The key shows the distribution of transcript breakpoints. Colour is matched to a ‘value’ measuring the number of reads with breakpoints at that position, log10-scaled. The histogram component illustrates the number of transcripts with a given abundance value.
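As a rough illustration of how the key's ‘value’ and histogram could be derived, the sketch below counts reads per breakpoint position, log10-scales the counts, and bins the scaled values. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the bin width, and the toy positions are hypothetical, and the histogram is interpreted here as counting breakpoint positions per abundance bin.

```python
import math
from collections import Counter


def breakpoint_values(breakpoints):
    """Map each genome position to log10(number of reads breaking there).

    `breakpoints` is assumed to be a list with one genome position per read.
    """
    counts = Counter(breakpoints)
    return {pos: math.log10(n) for pos, n in counts.items()}


def value_histogram(values, bin_width=0.5):
    """Count how many positions fall into each bin of log10-scaled value.

    Bins are labelled by their lower edge; `bin_width` is an arbitrary choice.
    """
    hist = Counter()
    for v in values.values():
        hist[math.floor(v / bin_width) * bin_width] += 1
    return dict(sorted(hist.items()))


# Toy data (hypothetical positions): one abundant leader-region breakpoint,
# one moderately abundant breakpoint, and two low-frequency breakpoints.
toy_breakpoints = [65] * 100 + [66] * 10 + [21000] + [25000]

values = breakpoint_values(toy_breakpoints)
hist = value_histogram(values)
print(values)
print(hist)
```

On the toy data, the two single-read positions share the lowest bin, which mirrors the caption's observation that many breakpoints occur at low frequency.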