A key aspect of RNA secondary structure prediction is the identification of novel functional elements. This is a challenging task because these elements are typically embedded in longer transcripts, so the borders between the element and its flanking regions have to be defined. The flanking sequences affect the folding of the functional element both in computational analyses and when the element is extracted as a transcript for experimental analysis.
Here, researchers from the University of Copenhagen analyze how different flanking region lengths impact folding into a constrained structure by computing probabilities of folding for different sizes of flanking regions. Their method, RNAcop (RNA context optimization by probability), is tested on known and de novo predicted structures. In vitro experiments support the computational analysis and suggest that for a number of structures, choosing proper lengths of flanking regions is critical.
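The idea of scanning flanking-region lengths can be illustrated with a minimal Python sketch. It is not the RNAcop implementation. It assumes that the probability of folding into a constrained structure is taken as the ratio of the constrained to the unconstrained partition function, which can be computed from the two ensemble free energies as exp(-(F_constrained - F_unconstrained) / RT). The function names, the `score` callback and the toy sequence are hypothetical. In practice the callback would wrap a partition-function folding tool, which is left out here.

```python
import math
from typing import Callable, Iterator, Tuple

# Gas constant in kcal/(mol*K) times 37 degrees Celsius in kelvin.
RT_KCAL_PER_MOL = 0.0019872 * 310.15


def constraint_probability(f_constrained: float,
                           f_unconstrained: float,
                           rt: float = RT_KCAL_PER_MOL) -> float:
    """Probability mass of all structures that satisfy a constraint.

    Both arguments are ensemble free energies in kcal/mol. The result is
    Z_constrained / Z_total = exp(-(F_constrained - F_unconstrained) / RT).
    """
    return math.exp(-(f_constrained - f_unconstrained) / rt)


def flanked_windows(transcript: str, start: int, end: int,
                    max_flank: int, step: int = 1
                    ) -> Iterator[Tuple[int, int, str]]:
    """Yield (left, right, subsequence) for every pair of flank lengths.

    The element itself is transcript[start:end] (0-based, end exclusive).
    Flank lengths are capped by the transcript boundaries.
    """
    max_left = min(max_flank, start)
    max_right = min(max_flank, len(transcript) - end)
    for left in range(0, max_left + 1, step):
        for right in range(0, max_right + 1, step):
            yield left, right, transcript[start - left:end + right]


def best_flanks(transcript: str, start: int, end: int, max_flank: int,
                score: Callable[[str, int, int], float],
                step: int = 1) -> Tuple[float, int, int]:
    """Return (best_score, left, right) over all flank combinations.

    `score` is a hypothetical callback that returns the probability of the
    element adopting its constrained structure within the given window.
    """
    scored = ((score(seq, left, right), left, right)
              for left, right, seq in flanked_windows(
                  transcript, start, end, max_flank, step))
    return max(scored, key=lambda item: item[0])


if __name__ == "__main__":
    # Toy transcript and a stand-in score; a real score would fold each window.
    toy = "AAACCCGGGUUU"
    dummy_score = lambda seq, left, right: -abs(left - 1) - abs(right - 2)
    print(best_flanks(toy, 3, 9, 2, dummy_score))
```

Separating window enumeration from scoring keeps the search independent of the folding tool, so the same loop can be reused with any method that returns a probability for a given window.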
An example of the expected improvement from adding flanking regions to a wcaG family sequence (RF01761, AACY020337922.1). (A) Maximum expected accuracy (MEA) structures predicted with RNAfold (20). (B) Agreement with the constrained structures. For paired nucleotides, the probability of being paired is shown; for unpaired nucleotides, the probability of being unpaired. (C) Corresponding sequences. Nucleotides in gray indicate flanking regions. ‘Unextended’ refers to the sequence without flanking nucleotides, and ‘predicted high probability’ refers to the sequence with flanking regions that lead to a high probability of observing the consensus structure.
Availability: RNAcop is available as a web server and as stand-alone software via http://rth.dk/resources/rnacop