It is widely believed that tertiary nucleotide-nucleotide interactions are essential in determining RNA structure and function. Currently, direct coupling analysis (DCA) infers nucleotide contacts in a sequence from an alignment of its homologous sequences across different species. DCA and similar approaches that use sequence information alone typically yield low accuracy, especially when few homologous sequences are available. New methods for RNA structural contact inference are therefore desirable, because even a single correctly predicted tertiary contact can make the difference between a correctly and an incorrectly predicted structure.
Researchers from the George Washington University and Central China Normal University have developed a new method, DIRECT (Direct Information REweighted by Contact Templates), which incorporates a Restricted Boltzmann Machine (RBM) to augment sequence co-variation information with structural features during contact inference.
Benchmark tests demonstrate that DIRECT achieves better overall performance than DCA approaches. Compared with mfDCA and plmDCA, DIRECT increases contact-prediction accuracy by an average of 41% and 18%, respectively. DIRECT also improves predictions for long-range contacts and captures more tertiary structural features.
Basic workflow of DIRECT for RNA tertiary contact prediction
(a) The corresponding RNA multiple sequence alignment (MSA) is extracted from the Rfam database. Traditional direct coupling analysis (DCA) predicts the tertiary contacts from sequence coevolution in the MSA. (b) DIRECT then reweights the contacts using structural templates learned by a Restricted Boltzmann Machine (RBM). (c) The reweighted contact prediction leads to better overall performance.
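The workflow above can be illustrated with a minimal Python sketch. It is not the DIRECT implementation: mutual information between alignment columns is used here as a much simpler stand-in for the direct information computed by DCA, and the `reweight` step with its `template_weights` dictionary is a hypothetical placeholder for the RBM-learned contact templates, whose actual form is not described in this summary. The toy alignment and all function names are likewise assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j.

    NOTE: a simple covariation measure used as a stand-in for DCA's
    direct information; it is not DCA itself.
    """
    n = len(msa)
    fi = Counter(seq[i] for seq in msa)            # single-column counts
    fj = Counter(seq[j] for seq in msa)
    fij = Counter((seq[i], seq[j]) for seq in msa)  # pair counts
    mi = 0.0
    for (a, b), count in fij.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((fi[a] / n) * (fj[b] / n)))
    return mi


def covariation_scores(msa, min_separation=1):
    """Score every column pair (i, j) with j - i >= min_separation."""
    length = len(msa[0])
    return {
        (i, j): mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
        if j - i >= min_separation
    }


def reweight(scores, template_weights):
    """Hypothetical reweighting step (assumption, not the RBM of DIRECT):
    multiply each score by a per-pair weight, defaulting to 1.0."""
    return {pair: s * template_weights.get(pair, 1.0)
            for pair, s in scores.items()}


# Toy alignment (made up): columns 0 and 3 covary as complementary bases,
# while columns 1 and 2 are fully conserved.
toy_msa = ["GAAC", "CAAG", "AAAU", "UAAA"]

scores = covariation_scores(toy_msa)
ranked = sorted(scores, key=scores.get, reverse=True)
print("top-ranked pair:", ranked[0])
```

In this toy case only the covarying pair (0, 3) receives a nonzero score, since conserved columns carry no covariation signal. A real pipeline would also need sequence reweighting, pseudocounts, and gap handling, none of which are shown here.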
Availability – The code and dataset are available at https://zhaolab.com.cn/DIRECT/.