– have an understanding of the nature of RNA-Seq data
– be able to design an RNA-Seq experiment for detection of differential expression between conditions
– be able to execute a complete analysis pipeline in Galaxy to detect differential expression
– be able to interpret differential expression results by gene set analysis
We will start with a thorough description of the properties of RNA-Seq data. This understanding helps us set up a sound experimental design: how many samples do I need? How deeply do they need to be sequenced? Which type of sequencing fits my scientific question?
Next, we will investigate the properties of raw RNA-Seq data as generated by the Illumina platform, by performing quality control. The necessity of preprocessing will be discussed.
We will extensively cover the situation in which a reference genome of the species under investigation is available. Alternative approaches will be touched upon. After mapping, quality control of the mapped reads will allow us to fine-tune mapping parameters to perform our final mapping.
We will next extract a count table, on which we can examine the properties and correlations of our samples.
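As a rough illustration of what examining sample correlations on a count table can look like, here is a minimal Python sketch. It is not the Galaxy tool used in the workshop; the gene names, sample names and counts are made up, and the log2(count + 1) transform is just one common, simple choice for stabilising the scale before computing correlations.

```python
import math

# Hypothetical toy count table: one row per gene, one column per sample.
samples = ["ctrl_1", "ctrl_2", "treat_1"]
counts = {
    "geneA": [100, 110, 30],
    "geneB": [50, 45, 200],
    "geneC": [0, 2, 1],
    "geneD": [800, 760, 790],
}


def log2_column(table, col):
    """Return log2(count + 1) for every gene in one sample column."""
    return [math.log2(row[col] + 1) for row in table.values()]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(table, n_samples):
    """Pairwise sample-to-sample correlations on log-transformed counts."""
    cols = [log2_column(table, j) for j in range(n_samples)]
    return [[pearson(cols[i], cols[j]) for j in range(n_samples)]
            for i in range(n_samples)]


corr = correlation_matrix(counts, len(samples))
for name, row in zip(samples, corr):
    print(name, [round(r, 3) for r in row])
```

In this toy table the two control samples correlate more strongly with each other than either does with the treated sample, which is the kind of pattern one hopes to see for replicates.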
On the extracted count table, we will discuss normalization and transformation. DESeq2 will be used to detect differential gene expression; we will see tips to optimize detection power and perform quality control of our analysis, mostly by plotting diagnostic graphs. Designs with one factor as well as designs with multiple factors will be considered.
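To make the idea of normalization concrete, the following is a small Python sketch of median-of-ratios size factors, the approach DESeq2 uses by default to correct for differences in sequencing depth. It is an illustration under simplifying assumptions, not the DESeq2 implementation itself (which is an R package and is run through Galaxy in this workshop); the function name `size_factors` and the toy counts are hypothetical.

```python
import math
from statistics import median


def size_factors(count_rows):
    """Median-of-ratios size factors.

    count_rows: list of per-gene count lists, one entry per sample.
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples would be zero.
    """
    n_samples = len(count_rows[0])
    usable = [row for row in count_rows if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has non-zero counts in every sample")

    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples
                     for row in usable]

    factors = []
    for j in range(n_samples):
        # Log ratio of each gene's count to its geometric mean.
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy example (made-up counts): sample 2 is sequenced twice as deeply.
toy_counts = [
    [10, 20],
    [100, 200],
    [5, 10],
    [0, 3],   # ignored: contains a zero
]
sf = size_factors(toy_counts)
normalized = [[c / f for c, f in zip(row, sf)] for row in toy_counts]
```

Dividing each sample's counts by its size factor puts the samples on a common scale; in the toy example the first three genes end up with identical normalized counts in both samples.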
Lastly, different methods to interpret differential gene expression results, such as gene set analysis, will provide us with valuable biological insights.
The workshop uses Galaxy to perform all analysis steps. Galaxy is an extensible framework for bioinformatics analysis: we will use tools and R scripts implemented in Galaxy by BITS. The commands and R scripts that Galaxy ultimately executes when a tool is run can be readily displayed, so no step remains a black box.
– Galaxy and basic Galaxy tools (‘Filter and sort’, ‘Text manipulation’)
– Fastq-mcf (ea-utils) and other preprocessing tools
– bamqc (Qualimap suite)
– familiarity with the Illumina sequencing process
– familiarity with basic NGS data formats: FASTQ, SAM/BAM, GTF, …
– familiarity with the use of Galaxy: visualising data, searching and running tools, saving and loading histories
If you do not meet these requirements, you can first follow the “Introduction to the analysis of NGS data” training.
Topics NOT covered
– RNA-seq assembly
– RNA-seq analysis for isoform detection
– RNA-seq analysis for detection of short RNA species
The training will take 2 days. The schedule will be added soon.