from the miRBase Blog – By Sam

miRBase 19 is now available, brought to you from the Benasque RNA meeting in the sunny Pyrenees, and with a slightly larger time gap than usual. In that extended time, we have added more than the usual number of new sequences — 3171 new hairpins and 3625 novel mature products, bringing the totals to 21264 and 25141 respectively in 193 species. As always, the full README file is available on the FTP site, along with downloadable files containing all data in various formats.

We have spent some time deleting misannotated sequences, and the deep sequencing read views will allow us to focus more on this: 133 entries have been removed in this release, many from the rice miRNA complement. We have also cleaned up a number of cases of duplicate entries mapping to a single genomic locus (some prompted by new genome assembly releases) and rationalised many miRNA names. This is therefore a good time to remind you that the names are meant to be useful, but are not formally stable, and shouldn’t be used to convey complex information. The miRNA accession numbers *do* remain stable between releases, and of course, you can always quote the sequence to be truly unambiguous.
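
Since names can change between releases while accessions do not, one way to keep your own tables stable is to key them on the accession. The sketch below is only an illustration: it assumes FASTA headers of the form `>name accession description`, and the record shown (`xxx-miR-example-5p`, `MIMAT9999999`) is made up for the example, not a real entry.

```python
def parse_fasta_by_accession(lines):
    """Index FASTA records by accession rather than by name.

    Assumes each header looks like '>name accession free text';
    that layout is an assumption of this sketch.
    """
    records = {}
    accession = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name, accession = fields[0], fields[1]
            records[accession] = {"name": name, "sequence": ""}
        elif accession is not None:
            # Sequence may be wrapped over several lines; concatenate them.
            records[accession]["sequence"] += line
    return records


# Hypothetical record, for illustration only.
example = [
    ">xxx-miR-example-5p MIMAT9999999 Hypothetical species miR-example-5p",
    "UGAGGUAGUAGG",
    "UUGUAUAGUU",
]

records = parse_fasta_by_accession(example)
print(records["MIMAT9999999"]["name"])
print(records["MIMAT9999999"]["sequence"])
```

If the name of an entry is rationalised in a later release, a table keyed this way still points at the same record, and the stored sequence gives you a second check.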

In this release, the miR* nomenclature is finally retired for all species, as previously promised. For every hairpin and mature sequence, all IDs that have previously been used in miRBase are now visible on the entry pages, and are downloadable in bulk from the FTP site.
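
If you have older results that still use miR* style names, a bulk list of previous IDs can be turned into a lookup from any historical ID to its accession. The following is a minimal sketch and not a description of the actual download: it assumes a tab-separated layout with the accession first and a semicolon-separated list of IDs second, and the sample line is invented.

```python
def build_alias_index(lines):
    """Map every ID ever used for an entry to its accession(s).

    Assumed line layout (an assumption of this sketch):
        ACCESSION<TAB>id1;id2;...;
    """
    index = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        accession, ids = line.split("\t", 1)
        for old_id in ids.split(";"):
            old_id = old_id.strip()
            if old_id:
                # A set, because one ID may have pointed at several
                # entries over time.
                index.setdefault(old_id, set()).add(accession)
    return index


# Hypothetical alias line, for illustration only.
sample = ["MIMAT9999999\txxx-miR-example*;xxx-miR-example-3p;"]

index = build_alias_index(sample)
print(index["xxx-miR-example*"])
```

Storing a set per ID is deliberate: an old name is not guaranteed to resolve to exactly one accession, so ambiguous cases should be checked by hand or against the sequence.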

At the time of writing, we have not added new deep sequencing datasets to the read view pages. However, a decent-sized update to that section will be coming along shortly, together with an announcement here.

As always, comments, questions, abuse and praise are all welcome here or by email.

(read more at miRBase.org)

Comments

One Response to “miRBase Version 19 is Released”

  1. jumo on August 2nd, 2012 5:18 am

    thanks!
    very good!
