mRNA therapeutics and vaccines are being developed for a broad range of human diseases, including COVID-19. However, their optimization is hindered by mRNA instability and inefficient protein expression. Stanford University researchers describe design principles that overcome these barriers. They developed an RNA sequencing-based platform called PERSIST-seq to systematically delineate in-cell mRNA stability, ribosome load, and in-solution stability across a library of diverse mRNAs. Surprisingly, they found that in-cell stability is a greater driver of protein output than high ribosome load. They further introduce a method called In-line-seq, applied to thousands of diverse RNAs, that reveals sequence- and structure-based rules for mitigating hydrolytic degradation. These findings show that highly structured “superfolder” mRNAs can be designed to improve both stability and expression, with further enhancement through pseudouridine nucleoside modification. Together, this study demonstrates simultaneous improvement of mRNA stability and protein expression and provides a computational-experimental platform for the enhancement of mRNA medicines.
PERSIST-seq overview and illustrative ribosome load insights
a Overview of the mRNA optimization workflow. Literature-mined and rationally designed 5′ and 3′ UTRs were combined with Eterna-designed and algorithmically designed coding sequences. All sequences were then tested experimentally in parallel for in-solution stability, in-cell stability, and ribosome load. Each mRNA design included a unique 6–9 nt barcode in the 3′ UTR for tag counting by short-read sequencing. b Experimental design for testing in-solution and in-cell stability and ribosome load in parallel. mRNAs were in vitro transcribed, 5′ capped, and polyadenylated in a pooled format, then either transfected into HEK293T cells or subjected to in-solution degradation. Transfected cells were harvested for sucrose gradient fractionation or in-cell degradation analysis. c Polysome trace from HEK293T cells transfected with the 233-mRNA pool. d 5′ UTR variants display a higher variance in mean ribosome load per construct, as determined from polysome sequencing, than the other design categories. The formula for ribosome load is given in the panel. Box hinges: 25% quantile, median, and 75% quantile, from left to right. Whiskers: lower or upper hinge ± 1.5 × interquartile range. e Heatmaps of polysome profiles for mRNA designs selected from the top, middle, and bottom five mRNAs (by ribosome load) in each design category. f Secondary structure model of the SARS-CoV-2 5′ UTR. Introduced mutations and substitutions are highlighted. g Heatmaps of polysome profiles of SARS-CoV-2 5′ UTR variants, sorted by ribosome load.
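The ribosome load formula in panel d is not reproduced in this caption text. As an illustration only, the sketch below assumes a common definition: the read-weighted mean number of ribosomes across polysome gradient fractions, computed per barcoded construct. It also computes the box-plot summary described for panel d (hinges at the 25%, 50%, and 75% quantiles, and limits at the hinges ± 1.5 × interquartile range). The function names, the ribosome number assigned to each fraction, and the toy read counts are all hypothetical and are not taken from the study.

```python
from statistics import quantiles


def mean_ribosome_load(fraction_counts, ribosomes_per_fraction):
    """Read-weighted mean ribosome number for one barcoded mRNA.

    fraction_counts: barcode read counts per gradient fraction (toy input).
    ribosomes_per_fraction: ribosome number assumed for each fraction.
    This weighting is an assumed definition, not the paper's stated formula.
    """
    if len(fraction_counts) != len(ribosomes_per_fraction):
        raise ValueError("one ribosome number is needed per fraction")
    total = sum(fraction_counts)
    if total == 0:
        raise ValueError("construct has no reads in any fraction")
    weighted = sum(c * n for c, n in zip(fraction_counts, ribosomes_per_fraction))
    return weighted / total


def box_summary(values):
    """Return the three hinges and the hinge +/- 1.5 x IQR limits."""
    # method="inclusive" interpolates linearly between observed data points
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_limit": q1 - 1.5 * iqr,
        "upper_limit": q3 + 1.5 * iqr,
    }


# Toy data: five fractions assumed to carry 0 to 4 ribosomes per mRNA.
ribosomes = [0, 1, 2, 3, 4]
counts = {
    "construct_A": [10, 20, 40, 20, 10],
    "construct_B": [50, 30, 10, 5, 5],
    "construct_C": [5, 5, 10, 30, 50],
}

loads = {name: mean_ribosome_load(c, ribosomes) for name, c in counts.items()}
summary = box_summary(list(loads.values()))

for name, load in loads.items():
    print(f"{name}: mean ribosome load = {load:.2f}")
print(summary)
```

In practice the read counts would usually be normalized per fraction (for example against spike-in controls or sequencing depth) before weighting. That step is omitted here because the caption does not describe it.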