MicroRNAs (miRNAs) are small non-coding RNAs that are key players in the regulation of gene expression. Over the last decade, as high-throughput sequencing technologies have become more accessible, different methods have been developed to identify miRNAs, most of which rely on pre-existing reference genomes. However, when a reference genome is absent or of low quality, such identification becomes more difficult.
In this context, researchers at the Université de Lyon developed BrumiR, an algorithm that discovers miRNAs directly and exclusively from sRNA-seq data. They benchmarked BrumiR on datasets encompassing animal and plant species, using both real and simulated sRNA-seq experiments. The results demonstrate that BrumiR reaches the highest recall for miRNA discovery among the state-of-the-art tools evaluated, while also being much faster and more efficient. This speed and efficiency allow BrumiR to analyze a large number of sRNA-seq experiments from plant or animal species. Moreover, BrumiR reports additional information on other expressed sequences (sRNAs, isomiRs, etc.), thus maximizing the biological insight gained from sRNA-seq experiments. Finally, when a reference genome is available, BrumiR provides a new mapping tool (BrumiR2ref) that performs an exhaustive a posteriori search to identify the precursor sequences.
BrumiR algorithm
The steps BrumiR follows to discover miRNAs from sRNA-seq data: 1.1 De Bruijn graph step, 1.2 Iterative tip removal step, 1.3 Delete neighbor connection step, 1.4 Tip removal step repetition, 1.5 Topology analysis step, 1.6 Re-assembling unipaths by CC step, 1.7 Re-clustering by overlap step, 1.8 Filtering other sRNAs by RFAM step, 1.9 BrumiR candidates catalog.
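To make the first two steps listed above more concrete, here is a minimal Python sketch of building a de Bruijn graph from reads and then removing tips (short dead-end branches, typically caused by sequencing errors). It is an illustration only and not BrumiR's actual implementation: the function names (build_de_bruijn, find_tip, remove_tips), the k-mer size, the max_len threshold, and the toy reads are all assumptions made for this example.

```python
def build_de_bruijn(reads, k):
    """Build a graph whose nodes are k-mers; an edge links two k-mers
    that appear consecutively in a read (overlap of k - 1 bases)."""
    succ, pred = {}, {}
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for kmer in kmers:
            succ.setdefault(kmer, set())
            pred.setdefault(kmer, set())
        for a, b in zip(kmers, kmers[1:]):
            succ[a].add(b)
            pred[b].add(a)
    return succ, pred


def delete_node(node, succ, pred):
    """Remove a node and every edge that touches it."""
    for s in succ.pop(node):
        pred[s].discard(node)
    for p in pred.pop(node):
        succ[p].discard(node)


def find_tip(end, fwd, bwd, max_len):
    """Return the nodes of a tip ending at `end`, or None.

    A tip here is a dead-end path of at most `max_len` nodes that
    hangs off a node which also branches somewhere else."""
    if fwd[end]:
        return None  # not a dead end in this direction
    path = [end]
    while len(path) <= max_len:
        parents = bwd[path[-1]]
        if len(parents) != 1:
            return None  # isolated path or merge point: leave it alone
        parent = next(iter(parents))
        if len(fwd[parent]) > 1:
            return path  # the parent has another branch, so this is a tip
        path.append(parent)
    return None  # too long to be treated as a tip


def remove_tips(succ, pred, max_len=2):
    """Iteratively delete tips in both directions until none remain."""
    removed = []
    changed = True
    while changed:
        changed = False
        for node in sorted(succ):
            if node not in succ:
                continue  # already deleted during this pass
            for fwd, bwd in ((succ, pred), (pred, succ)):
                tip = find_tip(node, fwd, bwd, max_len)
                if tip:
                    for n in tip:
                        delete_node(n, succ, pred)
                    removed.extend(tip)
                    changed = True
                    break
    return removed


# Toy example (hypothetical reads): the second read ends with an
# erroneous base, creating a one-node tip next to the main path.
reads = ["AACCGGTTAC", "CCGGA"]
succ, pred = build_de_bruijn(reads, k=4)
print(remove_tips(succ, pred, max_len=2))  # prints ['CGGA']
```

The sketch judges tips by length alone, which only works while max_len is shorter than the genuine branches; practical assemblers usually also take k-mer abundance into account when deciding which branch to prune.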
Availability – The code of BrumiR is freely available at https://github.com/camoragaq/BrumiR.