Starting Small – RNA sequencing shows great potential for the discovery of microRNAs

from Drug Discovery & Development

As researchers work out the kinks in the process, RNA sequencing shows great potential for the discovery of microRNAs.

RNA Sequencing (RNA-Seq) has quickly become the method of choice for discovery of new microRNAs (miRNAs) and other forms of small RNAs. While microarrays retain their place as a workhorse of expression profiling, researchers with an eye on discovery are moving from earlier approaches such as direct cloning, tiling arrays, or purely computational prediction to RNA-Seq, and are excited about the results next-gen sequencing has produced. While RNA-Seq has demonstrated advantages over conventional methods of small RNA analysis, researchers still need to overcome some of the unique challenges it presents, such as the handling of tremendous amounts of data. Despite this, miRNA discovery continues to advance and, with the advent of RNA-Seq, shows no signs of slowing.

Transcriptomics—the profiling of the transcriptome—aims to catalog the complete set of RNA transcripts produced by the genome, including mRNAs, non-coding RNAs, miRNAs, and other small RNAs. RNA-Seq is a powerful tool for providing fresh biological insight into the transcriptome. It can be used to determine the structure of genes, their splicing patterns, and other post-transcriptional modifications; detect rare and novel transcripts; and quantify the changing expression levels of each transcript during development and under different disease conditions.

RNA-Seq has several key advantages over more traditional transcriptome analysis methods, such as microarrays. RNA-Seq enables profiling of any species of transcript within a total RNA sample, thus providing a more comprehensive view of the transcriptome. It does not depend on prior sequence knowledge, so there is no need to design probes based on known sequence or secondary structure information. This means transcriptome profiling is possible in any species, making this method particularly attractive for non-model species. RNA-Seq enables “digital” transcript expression analysis, meaning that expression-level data are based on each individual transcript that is sequenced and counted. By increasing the sequencing depth, a potentially unlimited dynamic range can be reached, making RNA-Seq an ideal tool for the detection of rare transcripts. Finally, RNA-Seq provides information on sequence variation in transcripts, including information about post-transcriptional mutations and their genomic context.
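The "digital" counting described above can be illustrated with a minimal sketch. It assumes reads have already been trimmed to plain sequence strings; the function names and the toy sequences are invented for illustration, and reads-per-million scaling is only one of several normalization choices in use.

```python
from collections import Counter

def count_reads(reads):
    """Tally how many times each distinct sequence was read."""
    return Counter(reads)

def reads_per_million(counts):
    """Scale raw counts to reads per million (RPM) so that
    libraries sequenced to different depths can be compared."""
    total = sum(counts.values())
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}

# Toy library: invented sequences, not real miRNAs.
reads = ["ACGTACGT"] * 6 + ["TTGACCGA"] * 3 + ["GGCATTAC"]
counts = count_reads(reads)
rpm = reads_per_million(counts)

for seq, n in counts.most_common():
    print(seq, n, rpm[seq])
```

Because every read adds one to a tally, a rare transcript that appears once in this library would be expected to appear more often in a deeper one, which is the sense in which sequencing depth extends the dynamic range.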

miRNAs are a component of the transcriptome that has generated considerable interest lately as possible biomarkers or drug targets because of their extensive role in biological processes and cell functionality in normal versus diseased cells.

The miRBase sequence database is the primary public repository for newly discovered miRNAs, and the number of miRBase entries has grown rapidly from 218 in 2002 to over 20,000 in the latest version, suggesting the existence of many more miRNAs yet to be discovered [1]. Contributing to the rapid rate of discoveries has been the advancement of RNA-Seq technology. RNA-Seq, by its very nature, requires no prior sequence knowledge; in addition, its tunable dynamic range, increased sensitivity, and the digital aspect of reading every nucleotide of every transcript in a given sample make it an ideal discovery tool.

Microarrays, however, remain useful and accurate profiling tools for measuring expression levels of miRNA and other RNA transcripts under different developmental or disease conditions. Although they share the limitations common to all hybridization-based methods, microarrays have proven to be a reliable, robust workhorse for decades, capable of delivering high-throughput expression data rapidly, reproducibly, and cost-effectively.

Despite the excitement and the rapid transition from microarray to RNA-Seq technology, researchers have come to the realization that there are still issues to overcome. First, cost is not declining as rapidly as predicted. Most agreed that the rapid pace of advancement in sequencing technology and efficiency would surely result in a rapid price reduction. However, the cost remains relatively high compared to microarrays. In response, many have resorted to designing experiments with fewer samples (replicates), presuming that advanced technology could somehow compensate for the poor statistical design of the experiment. This is, of course, not true; the same statistical rules apply to sequencing as to microarrays. Importantly, a difference observed between just two samples carries no statistical significance, because without replicates there is no way to estimate the variability within each condition.
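The point about replicates can be made concrete with a small sketch. It uses Welch's t statistic purely as an illustration of why variance cannot be estimated from one sample per condition; real RNA-Seq analyses generally rely on dedicated count-based models rather than a plain t-test, and the expression values below are invented.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of expression values.

    Each group needs at least two replicates, otherwise the
    within-group variance (and so the statistic) is undefined.
    """
    if len(group_a) < 2 or len(group_b) < 2:
        raise ValueError("need at least two replicates per group")
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    std_err = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    if std_err == 0:
        raise ValueError("zero variance in both groups; statistic undefined")
    return (statistics.mean(group_a) - statistics.mean(group_b)) / std_err

# Invented normalized expression values for one miRNA.
control = [100.0, 110.0, 90.0]
treated = [200.0, 210.0, 190.0]
t_stat = welch_t(control, treated)
print(round(t_stat, 3))

# One sample per condition: the comparison cannot be tested at all.
try:
    welch_t([100.0], [200.0])
    single_sample_ok = True
except ValueError:
    single_sample_ok = False
```

With three replicates per condition the two-fold difference yields a large test statistic; with one sample per condition the same two-fold difference cannot be evaluated, however advanced the platform that produced it.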

Another challenge facing RNA-Seq users is the complexity of library preparation procedures and the debate over whether this multitude of steps introduces bias in expression results. For miRNA, sequencing has uncovered the existence of isomiRs, isoforms of consensus miRNA sequences that were not detectable by microarray. Because the sequences of the isomiRs differ by as little as one nucleotide located at the very end of the sequence, most thought that microarrays simply could not resolve these different species [2]. But now, many wonder whether the isomiRs are in fact just an artifact of the sequencing library preparation process. There is hope that advancements in library preparation kits will eventually iron out these issues. Changing the library preparation process, however, makes it difficult to compare results between experiments as the processes change over time. In response, some laboratories have resorted to brewing their own kits.
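To show what "differing by as little as one nucleotide at the very end" looks like in read data, here is a minimal sketch that sorts reads into the consensus sequence and candidate 3' end variants. The reference sequence is invented, the function name and the two-nucleotide tolerance are assumptions made for illustration, and the sketch says nothing about whether a variant is biological or a library preparation artifact.

```python
from collections import Counter

def classify_read(read, reference, max_end_diff=2):
    """Label a read relative to a consensus miRNA sequence.

    A read counts as a candidate 3' variant when it shares the
    reference's leading "core" and differs only within the last
    max_end_diff positions (trimmed, extended, or substituted).
    """
    core = reference[: len(reference) - max_end_diff]
    if read == reference:
        return "canonical"
    if read.startswith(core) and abs(len(read) - len(reference)) <= max_end_diff:
        return "3p_variant"
    return "other"

# Invented 22-nt consensus sequence, not a real miRNA.
reference = "ACGGTTCAAGTCCGATTGACCA"
reads = [
    reference,               # consensus
    reference[:-1],          # one nucleotide shorter at the 3' end
    reference[:-1] + "T",    # last nucleotide substituted
    reference + "A",         # one nucleotide longer at the 3' end
    "TTTTGGGGCCCCAAAATTTTGG",  # unrelated read
]
tally = Counter(classify_read(r, reference) for r in reads)
print(dict(tally))
```

A hybridization probe designed against the consensus would likely bind all four of the closely related reads, which is why sequencing was needed to reveal them as distinct species.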

Perhaps the largest challenge faced by scientists performing RNA-Seq experiments is data analysis. Most agree it’s the bottleneck of not just RNA sequencing, but of next-gen sequencing in general. Sequencing raises issues that were never encountered with microarrays, and these must be dealt with. For example, the amount of data generated by sequencing just a single sample is staggering, making the handling of a full experiment’s worth of data problematic. While improvements in computing power have begun to minimize this issue, the procedures for processing the data remain in flux. There is no consensus or standardized workflow for the filtering, normalization, alignment, and statistical analysis of these large, often complex data sets.
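One common way to cope with very large read files is to stream them record by record instead of loading everything into memory. The sketch below assumes unwrapped four-line FASTQ records whose adapters have already been trimmed; the 18 to 26 nucleotide window is an assumed filter for miRNA-sized reads, not a standard, and the function names are invented for illustration.

```python
import io
from collections import Counter

def iter_fastq_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record, lazily."""
    for index, line in enumerate(handle):
        if index % 4 == 1:
            yield line.strip()

def count_small_rnas(handle, min_len=18, max_len=26):
    """Count distinct sequences whose length falls in the given window."""
    counts = Counter()
    for seq in iter_fastq_sequences(handle):
        if min_len <= len(seq) <= max_len:
            counts[seq] += 1
    return counts

# Tiny in-memory stand-in for a FASTQ file; the reads are invented.
seq = "ACGGTTCAAGTCCGATTGACCA"
fastq_text = (
    f"@r1\n{seq}\n+\n{'I' * len(seq)}\n"
    f"@r2\n{seq}\n+\n{'I' * len(seq)}\n"
    "@r3\nACGT\n+\nIIII\n"
)
small_rna_counts = count_small_rnas(io.StringIO(fastq_text))
print(small_rna_counts)
```

Only the running tally is held in memory, so the same functions could in principle be pointed at an open file of any size; every threshold in such a filter is a choice each laboratory currently makes for itself, which is part of why results are hard to compare.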

RNA-Seq is a powerful new tool that provides significant advantages over previous methods. Already, RNA-Seq has advanced the study of miRNA through the discovery of previously undetected sequences. There is much excitement about the results generated so far and the potential exists for more discovery in the near future. As with most breakthrough technologies, however, there remains an adjustment period as kinks are worked out and processes are streamlined and standardized.


  1. Kozomara A, Griffiths-Jones S. miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 2011; 39(Database issue):D152-D157.
  2. Pritchard CC, Cheng HH, Tewari M. MicroRNA profiling: approaches and considerations. Nat Rev Genet. 2012; 13(5):358-369.
