Apr
2
MicroRNA discovery by similarity search to a database of RNA-seq profiles
Filed Under Databases, Other Tools
In silico searches for microRNAs (miRNAs) have been driven by methods that compile structural features of the miRNA precursor hairpin and, to some degree, combine these with analysis of RNA-seq profiles, in which miRNAs typically leave the Drosha/Dicer fingerprint of one or two ~22 nt blocks of reads corresponding to the mature and star miRNA.
Complementing the previous methods, researchers at the University of Copenhagen, Denmark present a study in which they systematically exploit these patterns of read profiles. They created databases of 2,540 miRNA read profiles using short RNA-seq data from miRBase and 4,795 read profiles from ENCODE (after preprocessing). Of the 4,795 ENCODE profiles, 1,361 are annotated as noncoding RNAs (ncRNAs), of which 285 are further annotated as miRNAs. Using deepBlockAlign (dba), they align ENCODE ncRNA profiles against the miRBase profiles (cleaned for “self-matches”) and are able to separate ENCODE miRNAs from the other ncRNAs with a Matthews correlation coefficient of 0.8 and an area under the curve of 0.93. Using the derived dba score cut-off, they predict 523 novel miRNA candidates. Further analysis reveals that these are located in genomic regions with (UCSC) MAF block fragmentation and poor sequence conservation, which might in part explain why they have been overlooked in previous efforts.
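For readers unfamiliar with the Matthews correlation coefficient (MCC) quoted above, here is a minimal Python sketch of how it is computed once a score cut-off splits profiles into predicted miRNAs and predicted non-miRNAs. The labels, scores and cut-off are made-up toy values, and the function names are hypothetical; this is not the authors' code or their data.

```python
import math

def confusion_counts(labels, scores, cutoff):
    """Count TP, TN, FP, FN when a profile is called miRNA if score >= cutoff."""
    tp = tn = fp = fn = 0
    for is_mirna, score in zip(labels, scores):
        predicted = score >= cutoff
        if predicted and is_mirna:
            tp += 1
        elif predicted and not is_mirna:
            fp += 1
        elif not predicted and is_mirna:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the table is empty
    return (tp * tn - fp * fn) / denominator

# Toy data (assumption): True = annotated miRNA, scores stand in for alignment scores.
toy_labels = [True, True, True, False, False, False]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
toy_mcc = matthews_corrcoef(*confusion_counts(toy_labels, toy_scores, cutoff=0.5))
```

An MCC of 1 means perfect separation, 0 means no better than chance, and -1 means total disagreement, which is why it is often preferred over accuracy when the classes are unbalanced, as they are here (285 miRNAs among 1,361 ncRNAs).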
The researchers further analyzed known miRNAs from human and mouse and found two distinct classes containing two blocks or more than two blocks, respectively; the latter class holds profiles with a less well-defined arrangement of reads. They also compared the read profiles of plants and animals in terms of both length and distribution of reads within the profiles, and observed that some read profiles were specific to one kingdom or the other.
Availability: All data, as well as a server for searching miRBase profiles by uploading a BED file, are available at http://rth.dk/resources/dba/mirna.
- Pundhir S, Gorodkin J. (2013) MicroRNA discovery by similarity search to a database of RNA-seq profiles. Frontiers in Bioinform & Comp Biol [Epub ahead of print]. [abstract]
Apr
1
The Queryable RNA Seq Database
Filed Under Databases
The purpose of the system is to automate and simplify as much as possible the process of analyzing RNA-Seq results data by storing it in a database and providing many options for querying it.

- Goal 1: Provide a system through which Biologists can analyze their RNA-Seq results data, specifically differential expression tests, novel transcript discoveries, and assembled transcripts.
- Goal 2: The system should allow the user to get meaningful results with minimal learning time. For this goal to be satisfied, two senior Biologists familiar with RNA-seq must approve the system.
Availability – The Queryable RNA Seq Database is available online at: https://github.com/fatPerlHacker/queryable-rna-seq-database
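To illustrate the general idea of storing differential expression results in a database and querying them, here is a small sketch using Python's built-in `sqlite3` module. The table name, column names, thresholds and rows are all invented for illustration; the actual project may use a different schema, database engine and language.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical differential expression table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE diff_expr (
               gene TEXT,
               sample_a TEXT,
               sample_b TEXT,
               log2_fold_change REAL,
               q_value REAL
           )"""
    )
    # Made-up rows, not real results.
    rows = [
        ("geneA", "control", "treated", 2.5, 0.001),
        ("geneB", "control", "treated", 0.3, 0.8),
        ("geneC", "control", "treated", -1.8, 0.02),
        ("geneD", "control", "treated", 3.0, 0.2),
    ]
    conn.executemany("INSERT INTO diff_expr VALUES (?, ?, ?, ?, ?)", rows)
    return conn

def significant_genes(conn, max_q=0.05, min_abs_log2fc=1.0):
    """Return genes passing both a q-value and an absolute fold-change threshold."""
    cursor = conn.execute(
        """SELECT gene, log2_fold_change, q_value
           FROM diff_expr
           WHERE q_value <= ? AND ABS(log2_fold_change) >= ?
           ORDER BY q_value""",
        (max_q, min_abs_log2fc),
    )
    return cursor.fetchall()

demo_conn = build_demo_db()
demo_hits = significant_genes(demo_conn)
```

Once results live in a table like this, filters that would otherwise require custom scripts over text files become one-line queries, which is the kind of simplification the project's goals describe.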
Mar
22
TIARA genome database – update 2013
Filed Under Databases
The Total Integrated Archive of short-Read and Array (TIARA) database stores and integrates human genome data generated from multiple technologies including next-generation sequencing and high-resolution comparative genomic hybridization array. The TIARA genome browser is a powerful tool for the analysis of personal genomic information by exploring genomic variants such as SNPs, indels and structural variants simultaneously. As of September 2012, the TIARA database provides raw data and variant information for 13 sequenced whole genomes, 16 sequenced transcriptomes and 33 high resolution array assays. Sequencing reads are available at a depth of ∼30× for whole genomes and 50× for transcriptomes. Information on genomic variants includes a total of ∼9.56 million SNPs, 23 025 of which are non-synonymous SNPs, and ∼1.19 million indels. In this update, by adding high coverage sequencing of additional human individuals, the TIARA genome database now provides an extensive record of rare variants in humans. Following TIARA’s fundamentally integrative approach, new transcriptome sequencing data are matched with whole-genome sequencing data in the genome browser. Users can here observe, for example, the expression levels of human genes with allele-specific quantification. Improvements to the TIARA genome browser include the intuitive display of new complex and large-scale data sets.
Availability: TIARA database is available online at – http://tiara.gmi.ac.kr
- Hong D, Lee J, Bleazard T, Jung H, Ju YS, Yu SB, Kim S, Park SS, Kim JI, Seo JS. (2013) TIARA genome database: update 2013. Database (Oxford) [Epub ahead of print]. [article]
Feb
11
RNA-eXpress annotates novel transcript features in RNA-seq data
Filed Under Databases, Other Tools
Next generation sequencing is rapidly becoming the approach of choice for transcriptional analysis experiments. Substantial advances have been achieved in computational approaches to support these technologies. These approaches typically rely on existing transcript annotations, introducing a bias towards known genes, require specific experimental design and computational resources, or focus only on identification of splice variants (ignoring other biologically relevant transcribed features contained within the data that may be important for downstream analysis). Biologically relevant transcribed features also include large and small non-coding RNA, new transcription start sites, alternative promoters, RNA editing and processing of coding transcripts. Also, many existing solutions lack accessible interfaces required for wide scale adoption.
Researchers at the Monash Institute of Medical Research, Monash University, Australia have developed a user-friendly, rapid and computation-efficient feature annotation framework (RNA-eXpress) that enables identification of transcripts and other genomic and transcriptional features independently of current annotations. RNA-eXpress accepts mapped reads in the standard binary alignment (BAM) format and produces a study-specific feature annotation in GTF format, comparison statistics, sequence extraction and feature counts. The framework is designed to be easily accessible while allowing advanced users to integrate new feature-identification algorithms through simple class extension, thus facilitating expansion to novel feature types or identification of study specific feature types.
Availability and Implementation: RNA-eXpress software, source code, user manuals, supporting tutorials, developer guides and example data are available at http://www.rnaexpress.org.
Contact: paul.hertzog@monash.edu
- Forster S, Finkel A, Gould J, Hertzog P. (2013) RNA-eXpress annotates novel transcript features in RNA-seq data Bioinformatics [Epub ahead of print]. [abstract]
Feb
6
The 2013 Nucleic Acids Research Database Issue and the online molecular biology database collection
Filed Under Databases, Publications
The 20th annual Database Issue of Nucleic Acids Research includes 176 articles, half of which describe new online molecular biology databases, while the other half provide updates on databases previously featured in NAR and other journals. This year’s highlights include two databases of DNA repeat elements; several databases of transcriptional factors and transcriptional factor-binding sites; databases on various aspects of protein structure and protein-protein interactions; databases for metagenomic and rRNA sequence analysis; and four databases specifically dedicated to Escherichia coli. The increased emphasis on using genome data to improve human health is reflected in the development of the databases of genomic structural variation (NCBI’s dbVar and EBI’s DGVa), the NIH Genetic Testing Registry and several other databases centered on the genetic basis of human disease, potential drugs, their targets and the mechanisms of protein-ligand binding. Two new databases present genomic and RNAseq data for monkeys, providing a wealth of data on our closest relatives for comparative genomics purposes. The NAR online Molecular Biology Database Collection has been updated and currently lists 1512 online databases.
The NAR online Molecular Biology Database Collection is available at http://www.oxfordjournals.org/nar/database/cap/.
The full content of the Database Issue is freely available online on the Nucleic Acids Research website: http://nar.oxfordjournals.org/.
- Fernández-Suárez XM, Galperin MY. (2013) The 2013 Nucleic Acids Research Database Issue and the online molecular biology database collection. Nucleic Acids Res 41(Database issue):D1-7. [article]
Dec
3
miRGator v3.0 – a microRNA portal for deep sequencing, expression profiling and mRNA targeting
Filed Under Databases, Expression and Quantification
Biogenesis and molecular function are two key subjects in the field of microRNA (miRNA) research. Deep sequencing has become the principal technique in cataloging of miRNA repertoire and generating expression profiles in an unbiased manner.
A team led by researchers at Ewha Womans University, Korea has updated miRGator to version 3.0. miRGator compiles publicly available deep sequencing miRNA data, and the team has implemented several novel tools to facilitate exploration of these massive data. The miR-seq browser lets users examine short read alignments alongside the secondary structure and read count information in concurrent windows. Features such as sequence editing, sorting, ordering, and import and export of user data should be of great utility for studying iso-miRs, miRNA editing and modifications. The miRNA-target relation is essential for understanding miRNA function. Coexpression analysis of miRNAs and target mRNAs, based on miRNA-Seq and RNA-Seq data from the same sample, is visualized in heat-map and network views, where users can investigate the inverse correlation of gene expression and target relations compiled from various databases of predicted and validated targets. By keeping datasets and analytic tools up-to-date, miRGator should continue to serve as an integrated resource for biogenesis and functional investigation of miRNAs.
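As a rough illustration of what "inverse correlation" between a miRNA and a target mRNA means in practice, the sketch below computes a Pearson correlation coefficient over paired expression values from the same samples. Pearson correlation is an assumption made for this example; the post does not state which statistic miRGator uses, and the expression values are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # constant profile: correlation is undefined, report 0 here
    return covariance / (spread_x * spread_y)

# Hypothetical expression levels across four samples (not real data).
mirna_levels = [10.0, 20.0, 30.0, 40.0]
target_levels = [8.0, 6.0, 4.0, 2.0]
r = pearson(mirna_levels, target_levels)  # strongly negative for this toy pair
```

A coefficient close to -1 for a predicted miRNA-target pair is consistent with the miRNA repressing that target, although correlation alone does not establish a direct interaction.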
Availability – miRGator v3.0 update is available at: http://mirgator.kobic.re.kr
- Cho S, Jang I, Jun Y, Yoon S, Ko M, Kwon Y, Choi I, Jang H, Ryu D, Lee B, Kim VN, Kim W, Lee S. (2012) miRGator v3.0: a microRNA portal for deep sequencing, expression profiling and mRNA targeting. Nucleic Acids Res [Epub ahead of print]. [article]
Sep
15
RhesusBase – a knowledgebase for the monkey research community
Filed Under Databases
Although the rhesus macaque is a unique model for the translational study of human diseases, currently its use in biomedical research is still in its infant stage due to error-prone gene structures and limited annotations. Here, we present RhesusBase for the monkey research community (http://www.rhesusbase.org).
Aug
23
What are the RNA-Seq models in Ensembl, and how were they determined? How does RNA-Seq data contribute to Ensembl gene sets? Can I upload my own RNA-Seq data to Ensembl? Answers to these questions and more…
Aug
6
The Pea RNA-Seq gene atlas
Filed Under Databases
Pea (Pisum sativum L.), with its high protein seeds and its ability to establish a symbiosis with soil nitrogen fixing bacteria, is a strategic crop in temperate regions. Moreover, pea is a long-standing model in genetics and physiology. This web-portal provides the first full-length Unigene set expression atlas for pea. Twenty pea cDNA libraries were prepared from different above- and below- ground cv “Cameor” plant organs, at different stages, and for different nutrition conditions. Libraries were sequenced using Next-Generation Sequencing technologies. Sequences were assembled de novo and a full-length Unigene set was produced. The sequencing depth of each cDNA contig relates to the expression level of transcripts. This gene atlas presents the pattern of expression and thus provides useful functional information for each cDNA contig. In the future, new RNA-Seq experiments will be added to this portal to enlarge the atlas’ scope.
The Pea RNA-Seq Gene Atlas is available at: http://bios.dijon.inra.fr/FATAL/cgi/pscam.cgi
Full-length de novo assembly of new pea RNA-seq data reveals the complexity of the pea transcriptome, S. Alves-Carvalho et al. in prep.
Aug
1
from the miRBase Blog – By Sam
miRBase 19 is now available, brought to you from the Benasque RNA meeting in the sunny Pyrenees, and with a slightly larger time gap than usual. In that extended time, we have added more than the usual number of new sequences: 3171 new hairpins and 3625 novel mature products, bringing the totals to 21264 and 25141 respectively in 193 species. As always, the full README file is available on the FTP site, along with downloadable files containing all data in various formats.
Jul
20
Incorporating RNA-seq data into the Zebrafish Ensembl Gene Build
Filed Under Databases
Ensembl gene annotation provides a comprehensive catalogue of transcripts aligned to the reference sequence. It relies on publicly available species specific and orthologous transcripts plus their inferred protein sequence. The accuracy of gene models is improved by increasing the species specific component which can be cost-effectively achieved using RNA-Seq. Two zebrafish gene annotations are presented in Ensembl version 62 built on the Zv9 reference sequence.
Firstly, RNA-Seq data from five tissues and seven developmental stages were assembled into 25,748 gene models. A 3′ end capture and sequencing protocol was developed to predict the 3′ ends of transcripts and 46.1% of the original models were subsequently refined.
May
15
miRFANs – an online database for Arabidopsis thaliana miRNA function annotations
Filed Under Databases
miRFANs is an online database for Arabidopsis thaliana miRNA function annotations. The creators integrated various types of datasets, including miRNA-target interactions, transcription factors (TFs) and their targets, expression profiles, genomic annotations and pathways, into a comprehensive database, and developed various statistical and mining tools.
miRFANs consists of:
- A comprehensive collection of miRNA targets for Arabidopsis thaliana, which provides valuable information about the functions of plant miRNAs.
- A highly informative miRNA-mediated genetic regulatory network extracted from the integrative database.
- A set of statistical and mining tools for analyzing and mining the database.
- A user-friendly web interface that facilitates browsing and analysis of the collected data.
miRFANs is freely available at: http://www.cassava-genome.cn/mirfans
- Liu H, Jin T, Liao R, Wan L, Xu B, Zhou S, Guan J. (2012) miRFANs: an integrated database for Arabidopsis thaliana microRNA function annotations. BMC Plant Biology [Epub ahead of print]. [abstract]
Apr
13
Visual Exploration and Statistics to Promote Annotation (VESPA)
Filed Under Databases, Other Tools

VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics data (probe or RNA-Seq) into a genomic context, all of which can be visualized at three levels of genomic resolution. Data are interrogated via searches linked to the genome visualizations to find regions with a high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently within the software through interaction with BLAST.
VESPA is applied to two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to demonstrate how rapidly mis-annotations can be found and explored using either proteomics data alone or in combination with transcriptomic data.
The software is freely available at https://www.biopilot.org/docs/Software/Vespa.php
- Peterson ES, McCue LA, Schrimpe-Rutledge AC, Jensen JL, Walker H, Kobold MA, Webb SR, Payne SH, Ansong CK, Adkins JN, Cannon WR, Webb-Robertson BJ. (2012) VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data. BMC Genomics 13(1), 131. [article]