The Pacific oyster, Crassostrea gigas, is one of the most important aquaculture shellfish resources worldwide. Considerable efforts have been made to improve knowledge of its genome and transcriptome, making C. gigas an emerging model organism among lophotrochozoans, the under-described sister clade of ecdysozoans within protostomes. These massive sequencing efforts offer the opportunity to assemble gene expression data and make this resource accessible and exploitable for the scientific community. We therefore undertook this assembly into an up-to-date, publicly available transcriptome database: the GigaTON (Gigas TranscriptOme pipeliNe) database.
Researchers from the University of Caen Basse-Normandie have assembled 2204 million sequences obtained from 114 publicly available RNA-seq libraries. These libraries cover all embryo-larval developmental stages, adult organs and different environmental stressors, including heavy metals, temperature, salinity and exposure to air, and were mostly produced as part of the Crassostrea gigas genome project. These data were analyzed in silico, resulting in 56621 newly assembled contigs that were deposited in a publicly available database, the GigaTON database. This database also provides powerful and user-friendly query tools to browse and retrieve information about annotation, expression level, UTRs, splicing and polymorphism, and gene ontology associated with all the contigs, both within each library and between all libraries.
Schematic overview of the GigaTON pipeline search, browse and retrieve tool. The database can be used to investigate the transcriptome assembly by searching contigs (orange) or polymorphism variants (blue), or to download featured data sets and library BAM files (black). The search for a specific contig (or a contig subset) can be achieved either by differential analysis of library pools (Venn diagrams and Digital Differential Display (DDD)) or by BLAST or BioMart search. The BioMart portal (grey) makes it possible to search a dataset using a combination of several filter criteria (shown in white), and to retrieve the results together with a wide range of characteristics and parameters, i.e. ‘attributes’ (shown in yellow). The resulting dataset (Selected contigs, shown in red; Selected variants, shown in turquoise) and associated attributes can be downloaded and/or browsed within the GigaTON pipeline for additional information. This additional information includes Sequence info (translation frames, ORF length…) or Depth (detailed view of expression level and assembly variants between libraries) for contigs (brown), or allele and feature views for variants (shown in purple). The number of user-exploitable fields/buttons is indicated in brackets under each corresponding criterion.
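To illustrate the kind of Venn-diagram comparison between library pools described in this overview, the following minimal Python sketch partitions contig identifiers according to the pools in which they are detected. It is not the GigaTON implementation: the function name venn_regions, the pool names and the contig identifiers are all hypothetical and serve only as an example.

```python
from typing import Dict, Set


def venn_regions(pool_a: Set[str], pool_b: Set[str]) -> Dict[str, Set[str]]:
    """Split contig identifiers into the three regions of a two-set Venn diagram."""
    return {
        "only_a": pool_a - pool_b,   # contigs detected only in the first pool
        "shared": pool_a & pool_b,   # contigs detected in both pools
        "only_b": pool_b - pool_a,   # contigs detected only in the second pool
    }


# Hypothetical contig identifiers detected in two library pools (illustrative only).
larval_pool = {"contig_0001", "contig_0002", "contig_0003"}
adult_pool = {"contig_0002", "contig_0003", "contig_0004"}

regions = venn_regions(larval_pool, adult_pool)
for name, contigs in regions.items():
    print(name, sorted(contigs))
```

Plain set operations are used here because each Venn region corresponds directly to a set difference or intersection; a comparison of more than two pools would follow the same principle.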
The GigaTON database provides a convenient, powerful and versatile interface to browse, retrieve, cross-check and compare massive transcriptomic information across an extensive range of conditions, tissues and developmental stages in Crassostrea gigas. The GigaTON database constitutes the most extensive transcriptomic database to date in marine invertebrates and thereby provides a new reference transcriptome for the oyster, a highly valuable resource for physiologists and evolutionary biologists.
Availability – The GigaTON database is available at http://gigaton.sigenae.org. There are no restrictions on its use by non-academics.