With the recent advent of RNA-seq technology, the proteomics community has begun to generate sample-specific protein databases for peptide and protein identification, an approach known as proteomics informed by transcriptomics (PIT). This approach has attracted considerable interest, particularly among researchers who work with non-model organisms or with highly dynamic proteomes such as those observed in developmental biology and host-pathogen studies. PIT has been shown to improve coverage of known proteins and to reveal potential novel gene products. However, many groups are impeded in their use of PIT by the complexity of the required data analysis. This analysis requires the integration of a number of software tools from at least two different communities, and because PIT has a range of biological applications, a single software pipeline is not suitable for all use cases.
To overcome these problems, a team led by researchers at Queen Mary University of London has created GIO, a software system that uses the well-established Galaxy platform to make PIT analysis available to the typical bench scientist through a simple web interface. GIO provides workflows for four common use cases: a standard search against a reference proteome; PIT protein identification without a reference genome; PIT protein identification using a genome guide; and PIT genome annotation. These workflows are built from individual tools that can be reconfigured and rearranged within the web interface to create new workflows for additional use cases.
Availability – A demonstration of this customised Galaxy version, Galaxy Integrated Omics (GIO), is freely accessible at gio.sbcs.qmul.ac.uk. A repository of the core GIO components, together with an automated installation program, is maintained on GitHub (https://github.com/wizardfan/gio-repository) so that groups who wish to add GIO functionality to their own Galaxy server can do so.