Upcoming Workshop – A Beginner’s Guide to RNA-Seq Data Analysis

Quality Control, Read Mapping, Visualization and Downstream Analyses

When? 1 – 5 February 2016

Where? iad Pc-Pool, Rosa-Luxemburg-Straße 23, Leipzig, Germany

Scope and Topics

The purpose of this workshop is to give participants a deeper understanding of Next-Generation Sequencing (NGS), with a special focus on bioinformatics issues. In addition, all participants should be able to perform the important tasks of NGS data analysis themselves.

The first workshop module is an introduction to data analysis using Linux, ensuring that all participants are able to follow the practical parts. The second module discusses the advantages and disadvantages of current sequencing technologies and their implications for data analysis. The most important NGS file formats (fastq, sam/bam, bigWig, etc.) are introduced, followed by first hands-on analyses (QC, mapping, visualization). You will learn how to read and interpret QC plots, clip adapter sequences and/or trim low-quality read ends, get the bioinformatics background of read mapping and understand its problems (dynamic programming, alignment visualization, NGS mapping heuristics, etc.), compute your own mapping statistics and visualize your data in different ways (IGV, UCSC, etc.). The last module addresses a specific application of NGS: RNA-seq data analysis and the detection of differentially expressed genes.

Workshop Structure

This workshop has been redesigned and adapted to the needs of beginners in the field of NGS bioinformatics and comprises these three course modules:

  1. Linux for Bioinformatics:
    This module will introduce the essential tools and file formats required for NGS data analysis. It helps participants overcome the first hurdles when starting to work with this operating system, which is unavoidable for NGS analyses.
  2. Introduction to NGS data analysis:
    Different NGS methods will be explained, the most important notation introduced, and first analyses performed. This course covers essential knowledge for analysing data from many different NGS applications.
  3. RNA-seq Data Analyses:
    RNA-Seq for model organisms

Detailed Program

Monday
Linux for bioinformatics

  • Introduction to the command line and important commands
  • Combining commands by piping and redirection
  • Introduction to bioinformatics file formats (e.g. FASTA, BED, VCF, WIG) and databases (e.g. UCSC, ENSEMBL)
  • Usage of important bioinformatics toolkits (BEDtools, UCSCtools)
  • Introduction to R

Tuesday and Wednesday
Introduction to NGS data analysis

  • Introduction to sequencing technologies from a data analyst's view
  • Raw sequence files (FASTQ format)
  • Preprocessing of raw reads: quality control (FastQC), adapter clipping, quality trimming
  • Introduction to read mapping (Alignment methods, Mapping heuristics)
  • Read mapping (BWA, Bowtie2, STAR, segemehl)
  • Mapping output (SAM/BAM format)
  • Usage of important NGS toolkits (samtools, BEDtools)
  • Mapping statistics
  • Visualization of mapped reads (IGV, UCSC)
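
To give an idea of what quality trimming involves, here is a minimal Python sketch, not taken from the course materials. It assumes the common Phred+33 quality encoding and a hypothetical cutoff of Q20; the function names are made up for illustration, and dedicated trimming tools use more refined strategies.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (assumes Phred+33 encoding)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20):
    """Drop bases from the 3' end until a base with quality >= min_quality is reached."""
    scores = phred_scores(quality_string)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


def parse_fastq(lines):
    """Yield (read_id, sequence, quality) from FASTQ lines (4 lines per record)."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between or after records
        sequence = next(it).strip()
        next(it)  # '+' separator line
        quality = next(it).strip()
        yield header.strip()[1:], sequence, quality


# Illustrative record: the last two bases have very low quality ('#' = Q2, '!' = Q0).
example = ["@read1", "ACGTACGTAC", "+", "IIIIIIII#!"]
for read_id, seq, qual in parse_fastq(example):
    trimmed_seq, trimmed_qual = trim_3prime(seq, qual)
    print(read_id, trimmed_seq, trimmed_qual)
```

In this example the two low-quality bases at the read end are removed, while the eight high-quality bases ('I' = Q40) are kept.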

Thursday and Friday
RNA-seq Data Analyses

  • Understand split-read mapping
  • Run different split-read mappers (tophat, segemehl, STAR)
  • Understand the Tuxedo Suite (cufflinks, cuffcompare, cuffmerge, cuffdiff, etc.)
  • Predict new transcripts/isoforms using cufflinks/cuffmerge
  • Quantify exons/genes/transcripts
  • Predict
    • Differential exon usage using DEXSeq
    • Differential gene expression using DESeq
    • Differential isoform expression using cuffdiff
  • Predict non-standard transcripts (circularized RNAs and/or fusion transcripts)
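
As a small illustration of split-read mapping, the following Python sketch (not part of the course materials) shows how spliced alignments appear in SAM/BAM output: the `N` operation in the CIGAR string marks a stretch of the reference that the read skips, typically an intron. The function name `spliced_gaps` is hypothetical, and the coordinates are 1-based as in the SAM POS field.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def spliced_gaps(pos, cigar):
    """Return reference intervals (1-based, inclusive) skipped by 'N' operations."""
    gaps = []
    ref_pos = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            gaps.append((ref_pos, ref_pos + length - 1))
        # Only these operations consume reference positions.
        if op in "MDN=X":
            ref_pos += length
    return gaps


# A read starting at position 100: 50 aligned bases, a 1000-base gap, 50 aligned bases.
print(spliced_gaps(100, "50M1000N50M"))
```

A read without any `N` operation, such as `100M`, yields an empty list, which is how unspliced alignments can be told apart from split reads in this sketch.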


Requirements

  • Basic understanding of molecular biology (DNA, RNA, gene expression, PCR, …)
  • For the Introduction to NGS Data Analysis and downstream courses: basic Linux and bioinformatics knowledge (shell usage, common commands and tools). You should be familiar with the commands covered in the Learning the Shell Tutorial.

Target Audience

  • Biologists or data analysts with little or no experience in analyzing RNA-Seq data

Included in the Course

  • Course materials
  • Catering
  • Conference Dinner


Gero Doose (University of Leipzig) found and published several circularized RNAs in various RNA-Seq experiments. He specialized in split-read analysis some years ago and has strong expertise in downstream analyses.

Christian Otto (CCR Bio-IT) is one of the developers of the split-read mapping tool segemehl and is an expert on implementing efficient algorithms for HTS data analyses.

David Langenberger (ecSeq Bioinformatics) started working with small non-coding RNAs in 2006. Since 2009 he has used HTS technologies to investigate these short regulatory RNAs as well as other targets. He has been part of several large HTS projects, for example the International Cancer Genome Consortium (ICGC).

Mario Fasold (ecSeq Bioinformatics) has worked on the analysis of microarray data since 2007 and has developed several bioinformatics tools such as the Bioconductor package AffyRNADegradation and the Larpack program package. Since 2011 he has specialized in the field of HTS data analysis and has helped analyse sequencing data from several large consortium projects.

Key Dates

Opening Date of Registration: 1 July 2015
Closing Date of Registration: 15 January 2016
Workshop: 1 – 5 February 2016 (8 am – 5 pm)


Location: iad Pc-Pool, Rosa-Luxemburg-Straße 23, Leipzig, Germany
Language: English
Available seats: 24 (first-come, first-served)

Registration fee: 1,390 EUR (without VAT)

Travel expenses and accommodation are not covered by the registration fee.


ecSeq Bioinformatics
04275 Leipzig
Email: events@ecSeq.com
