The genome-wide expression profile of genes in different tissues/cell types and developmental stages is a vital component of many functional genomic studies. Transcriptome data obtained by RNA-sequencing (RNA-Seq) is often deposited in public databases that are made available via data portals. Data visualization is one of the first steps in assessment and hypothesis generation. However, these databases do not typically include visualization tools and establishing one is not trivial for users who are not computational experts. This, as well as the various formats in which data is commonly deposited, makes the processes of data access, sharing and utility more difficult. Our goal was to provide a simple and user-friendly repository that meets these needs for data-sets from major agricultural crops.
AgriSeqDB is a database for viewing, analysing and interpreting developmental and tissue/cell-specific transcriptome data from several species, including major agricultural crops such as wheat, rice, maize, barley and tomato. The disparate manner in which public transcriptome data is often warehoused and the challenge of visualizing raw data are both major hurdles to data reuse. The popular eFP browser does an excellent job of presenting transcriptome data in an easily interpretable view, but until now it has mostly been implemented on a case-by-case basis.
Here, La Trobe University researchers present an integrated visualisation database of transcriptome data-sets from six species that did not previously have public-facing visualisations. They combine the eFP browser, for gene-by-gene investigation, with the Degust browser, which enables visualisation of all transcripts across multiple samples. The two visualisation interfaces launch from the same point, enabling users to easily switch between analysis modes. The tools allow users, even those without bioinformatics expertise, to mine into data-sets and understand the behaviour of transcripts of interest across samples and time. The researchers have also incorporated an additional graphic download option to simplify incorporation into presentations or publications.
Powered by eFP and Degust browsers, AgriSeqDB is a quick and easy-to-use platform for data analysis and visualization in five crops and Arabidopsis. Furthermore, it provides a tool that makes it easy for researchers to share their data-sets, promoting research collaborations and data-set reuse.
a The Arabidopsis time-series data-set is shown here as an example, displaying transcripts that are up-regulated in S samples but down-regulated in SL samples (Top panel). The user can select which samples to view with the checkboxes in the top left of the screen, along with the method of analysis (voom/limma, edgeR, or voom). In the top right, the user can control the rendering and thresholds using the Options dialog. All genes that match the filters above are shown in a heat-map, which clusters genes with similar levels of expression (Middle panel). Running the mouse over each gene highlights it in the plots above. A table lists all matching genes with the expression levels for each sample, the false discovery rate and any extra annotation columns provided in the data-set (Lower panel). In the top centre, the user can limit genes by using one of three interactive plots; the parallel coordinates plot allows the user to limit genes by their log fold change in expression (per sample). b Example of an MA plot. Users can limit genes by drawing a box around genes on the MA plot; the two samples used for the MA plot are specified in the Options dialog (top right). c An MDS plot showing groupings of the individual replicates of each sample. Data are from the transcriptome of Arabidopsis seeds during germination.
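To make the box selection on the MA plot more concrete, here is a minimal Python sketch of the general idea. It is not Degust's or AgriSeqDB's code: the function names, the pseudocount of 1, the gene identifiers and the expression values are all hypothetical and chosen only for illustration. For each gene, M is the log2 fold change between two samples and A is their mean log2 expression; drawing a box amounts to keeping the genes whose (A, M) point falls inside it, here combined with a false discovery rate cut-off.

```python
import math


def ma_values(expr_a, expr_b, pseudocount=1.0):
    """Return (M, A) for one gene from its expression in two samples.

    M is the log2 fold change (sample B over sample A) and A is the mean
    log2 expression. The pseudocount is an assumption to avoid log2(0).
    """
    log_a = math.log2(expr_a + pseudocount)
    log_b = math.log2(expr_b + pseudocount)
    return log_b - log_a, (log_a + log_b) / 2.0


def select_in_box(genes, a_range, m_range, max_fdr=0.05):
    """Keep gene ids whose (A, M) point lies in the box and whose FDR passes."""
    selected = []
    for gene in genes:
        m, a = ma_values(gene["sample_a"], gene["sample_b"])
        in_box = (a_range[0] <= a <= a_range[1]
                  and m_range[0] <= m <= m_range[1])
        if in_box and gene["fdr"] <= max_fdr:
            selected.append(gene["id"])
    return selected


# Hypothetical data: made-up identifiers, expression values and FDRs.
genes = [
    {"id": "gene1", "sample_a": 7, "sample_b": 31, "fdr": 0.01},
    {"id": "gene2", "sample_a": 15, "sample_b": 15, "fdr": 0.90},
    {"id": "gene3", "sample_a": 63, "sample_b": 15, "fdr": 0.02},
]

# A box covering moderately expressed, up-regulated genes.
up_regulated = select_in_box(genes, a_range=(3.0, 5.0), m_range=(1.0, 3.0))
print(up_regulated)
```

In the browser the same selection is made interactively by dragging on the plot; the sketch only shows the filtering logic that such a box implies.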
Availability – https://expression.latrobe.edu.au/agriseqdb