Sequencing-based RNA structure probing can generate transcriptome-wide profiles of RNA secondary structures. Sufficient structural coverage is needed to obtain unbiased insights about RNA structures and functions, yet probing methods often yield uneven coverage, with missing structural scores across many transcripts. To overcome this barrier, Tsinghua University researchers have developed StructureImpute, a deep learning framework inspired by depth completion from computer vision that integrates an RNA sequence with available RNA structural information of neighbouring nucleotides to infer missing structure scores. The researchers demonstrate the strong imputation performance of StructureImpute, with accuracy far higher than that of predictions based on RNA sequence alone. They also show that StructureImpute reliably reconstructs RNA structural patterns at biologically important regions of RNA regulation, including protein-binding and RNA-modification sites. Strikingly, StructureImpute can use transfer learning to apply a model trained on one dataset to accurately infer missing structural scores in other datasets, even if they were generated with different technologies (for example, icSHAPE and DMS-seq).
The overall architecture of StructureImpute for RNA structural score imputation
StructureImpute consists of two input branches: an RNA sequence branch and a structural score branch. Each branch passes through a convolution layer, a residual block and an LSTM block; the two branches are then merged by element-wise multiplication and fed into a fully connected layer. Finally, the output is transformed with a sigmoid function. ReLU, rectified linear unit.
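To make the merge step in the caption concrete, here is a minimal sketch in plain Python of how two branch outputs could be fused by element-wise multiplication, passed through a fully connected layer and squashed with a sigmoid. It is an illustration only, not the authors' implementation: the function names, the per-nucleotide feature layout, the single-output fully connected layer and all numeric values are assumptions, and the convolution, residual and LSTM blocks that would produce the features are left out.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Numerically stable logistic function; output lies between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def merge_and_score(
    seq_features: List[List[float]],
    struct_features: List[List[float]],
    fc_weights: List[float],
    fc_bias: float,
) -> List[float]:
    """Fuse two branch outputs and return one score per nucleotide.

    Both feature arguments are laid out as [position][hidden unit], i.e.
    one feature vector per nucleotide. This layout, and the use of a
    single output unit in the fully connected layer, are assumptions
    made for illustration.
    """
    if len(seq_features) != len(struct_features):
        raise ValueError("both branches must cover the same nucleotides")

    scores = []
    for seq_vec, struct_vec in zip(seq_features, struct_features):
        # Element-wise multiplication merges the two branches.
        merged = [s * t for s, t in zip(seq_vec, struct_vec)]
        # Fully connected layer: weighted sum of merged features plus bias.
        logit = sum(w * m for w, m in zip(fc_weights, merged)) + fc_bias
        # The sigmoid bounds the imputed score between 0 and 1.
        scores.append(sigmoid(logit))
    return scores
```

With hypothetical inputs, `merge_and_score([[0.5, -1.0], [1.0, 2.0]], [[2.0, 0.0], [0.0, 0.0]], [1.0, 1.0], 0.0)` returns two scores, one per nucleotide. The second is exactly 0.5, because an all-zero structural feature vector zeroes out the merged features. This shows one property of a multiplicative merge: either branch can gate the other's contribution.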
Availability – Code used for training models and performing analyses is available from GitHub (https://github.com/Tsinghua-gongjing/StructureImpute).