Informatics for RNA-seq Analysis (2015)


Course Objectives

High-throughput sequencing of RNA libraries (RNA-seq) has become increasingly common and largely supplanted gene microarrays for transcriptome profiling. When processed appropriately, RNA-seq data has the potential to provide a considerably more detailed view of the transcriptome. The CBW has developed a 2-day course providing an introduction to RNA-seq data analysis followed by integrated tutorials demonstrating the use of popular RNA-seq analysis packages. The tutorials are designed as self-contained units that include example data (Illumina paired-end RNA-seq data) and detailed instructions for installation of all required bioinformatics tools (TopHat, Cufflinks, etc.).

Participants will gain practical experience and skills to be able to:

  • Align RNA-seq data to a reference genome (required)
  • Estimate known gene and transcript expression
  • Perform differential expression analysis
  • Discover novel isoforms
  • Visualize and summarize the output of RNA-seq analyses

Target Audience

Graduates, postgraduates and PIs working with or about to embark on an analysis of RNA-seq data. Attendees may be familiar with some aspect of RNA-seq analysis (e.g. gene expression analysis) or have no direct experience. A reference genome is required.

Prerequisites for attendance:

Basic familiarity with the Linux environment and S, R, or Matlab. Participants must be able to complete and understand the following simple Linux and R tutorials (up to and including “Descriptive Statistics”) before attending:

You will also require your own laptop computer. Minimum requirements: 1024×768 screen resolution, 1.5GHz CPU, 1GB RAM, and a recent version of Windows, Mac OS X or Linux (most computers purchased in the past 3-4 years likely meet these requirements). If you do not have access to your own computer, you may borrow one from the CBW. Please contact the CBW for more information.

Course Outline

Day 1

Module 1 – Introduction to Cloud Computing (2015) (Faculty: Malachi Griffith)

  • Introduction to cloud computing concepts

Lab Practical: Learn to configure, launch and connect to an Amazon cloud instance

Module 2 – Introduction to RNA sequencing and analysis (2015) (Faculty: Malachi Griffith)

  • Basic introduction to biology of RNA-seq
  • Experimental design and analysis considerations
  • Commonly asked questions

Lab Practical:

  • Introduction to the test data
  • Examine and understand the format of raw FastQ files
  • Obtain reference genomes (fasta) and gene annotation resources (GTF/GFF)
  • Perform pre-alignment QC
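
As a rough illustration of the FastQ format examined in this practical, the sketch below reads four-line records (header, sequence, separator, quality string) and converts the quality characters to Phred scores. It is not part of the course materials: the function name `read_fastq` and the example reads are hypothetical, and it assumes Phred+33 quality encoding and unwrapped (single-line) sequences.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, phred_scores) from a 4-line-per-record FASTQ stream.

    Assumes Phred+33 encoding and that sequence/quality are not line-wrapped.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input (or a blank line)
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # Phred+33: quality score = ASCII code of the character minus 33
        scores = [ord(char) - 33 for char in quality]
        yield header[1:], sequence, scores


# Two made-up reads, used only to show the record layout
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!5?I\n")
records = list(read_fastq(example))
for read_id, sequence, scores in records:
    print(read_id, sequence, scores)
```

For real data, the same function could be given a handle from `open()` or `gzip.open(path, "rt")`; dedicated QC tools remain the better choice for pre-alignment QC.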

Module 3 – RNA-seq alignment and visualization (2015) (Faculty: Obi Griffith)

  • Use of Bowtie/TopHat
  • Introduction to the BAM format
  • Basic manipulation of BAMs with samtools, Picard etc.
  • Visualization of RNA-seq alignments – IGV
  • BAM read counting and determination of variant allele expression status

Lab Practical:

  • Run Bowtie2/TopHat2 with parameters suitable for gene expression analysis
  • Use samtools to explore the features of the SAM/BAM format and perform basic manipulation of these alignment files (view, sort, index, manipulate headers, extract data, etc.)
  • Use IGV to visualize TopHat2 alignments, view a variant position, load exon junctions files, etc.
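
When exploring SAM/BAM records, the FLAG column is often the least intuitive field because it packs several yes/no properties into one integer. The sketch below decodes it using the bit values defined in the SAM specification; the short labels and the function name `decode_flag` are illustrative choices, not samtools output.

```python
# Bit values from the SAM specification; the labels are informal shorthand
SAM_FLAGS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [label for bit, label in SAM_FLAGS if flag & bit]


print(decode_flag(99))   # 99  = 1 + 2 + 32 + 64
print(decode_flag(147))  # 147 = 1 + 2 + 16 + 128
```

Flags 99 and 147 are a typical pair for properly paired reads: the first mate on the forward strand and the second mate on the reverse strand.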

Day 2

Module 4 – Expression and Differential Expression (2015) (Faculty: Obi Griffith)

  • Get FPKM style expression estimates using Cufflinks
  • Perform differential expression analysis with Cuffdiff
  • Perform summary analysis with CummeRbund
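
To make the FPKM unit concrete, the sketch below computes it from its basic definition: fragments per kilobase of transcript per million mapped fragments. This is only the textbook formula with made-up numbers; Cufflinks itself estimates expression with a statistical model, so its reported values will generally differ from this simple calculation.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments / (length in kb) / (total mapped fragments in millions)
         = fragments * 1e9 / (length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical numbers: 500 fragments on a 2,000 bp transcript,
# in a library of 20 million mapped fragments
value = fpkm(500, 2000, 20_000_000)
print(value)
```

Here 500 fragments over 2 kb gives 250 fragments per kilobase, and dividing by 20 (millions of mapped fragments) gives 12.5.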

Downstream interpretation of expression analysis (multiple testing, clustering, heatmaps, classification, pathway analysis, etc.) will also be discussed.
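
As one example of the multiple testing issue mentioned above, the sketch below applies the Benjamini-Hochberg procedure to a list of p-values to obtain FDR-adjusted values. It is a minimal illustration with made-up p-values and a hypothetical function name, not a description of how any particular tool in this course computes its adjusted values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values for four genes
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
for p, q in zip(raw, adjusted):
    print(f"p={p:.3f}  adjusted={q:.3f}")
```

Each adjusted value is the smallest false discovery rate at which that gene would still be called significant, which is why adjusted values are never smaller than the raw p-values.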

Lab Practical:

  • Run Cufflinks, Cuffdiff, and CummeRbund
  • Explore the output of these in R

Module 5 – Isoform Discovery and Alternate Expression (2015) (Faculty: Malachi Griffith)

  • Explore use of Cufflinks in reference annotation based transcript (RABT) assembly mode and ‘de novo’ assembly mode. Both modes require a reference genome sequence.

Lab Practical: Run Cufflinks in alternate modes more conducive to isoform discovery and explore the results
