A protocol to help new RNA-seq users understand the basic steps necessary to analyze an RNA-seq dataset properly

RNA-seq is a revolutionary technology for the life sciences with many applications, and its computational pipeline has many variations. Researchers from the Functional Genomics Center Zurich describe a protocol for RNA-seq data analysis that aims to identify differentially expressed genes between two conditions. The protocol follows recently published best practices for RNA-seq data analysis and applies quality checkpoints throughout the analysis to ensure reliable data interpretation. It is written to help new RNA-seq users understand the basic steps needed to analyze an RNA-seq dataset properly. An extension of the protocol has been implemented as automated workflows in the R package ezRun, which is also available in the data analysis framework SUSHI, for reliable, repeatable, and easily interpretable analysis results.
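To make the idea of a two-condition comparison concrete, here is a minimal toy sketch in Python. It is not the protocol's method: the gene names, the counts, and the pseudocount of 1 are all hypothetical, and it only computes a log2 fold change between two library-size-normalized samples. A real analysis would use replicates and a dedicated statistical package to decide which differences are significant.

```python
import math


def counts_per_million(counts):
    """Scale the raw read counts of one sample to counts per million (CPM)."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_change(cond_a, cond_b, pseudocount=1.0):
    """Return log2(CPM_b / CPM_a) per gene; the pseudocount avoids division by zero."""
    cpm_a = counts_per_million(cond_a)
    cpm_b = counts_per_million(cond_b)
    return {
        gene: math.log2((cpm_b[gene] + pseudocount) / (cpm_a[gene] + pseudocount))
        for gene in cpm_a
    }


# Hypothetical counts for one control and one treated sample.
control = {"geneA": 100, "geneB": 300, "geneC": 600}
treated = {"geneA": 400, "geneB": 300, "geneC": 300}

lfc = log2_fold_change(control, treated)
for gene, value in sorted(lfc.items()):
    print(f"{gene}: log2FC = {value:+.2f}")
```

With these made-up numbers, geneA comes out roughly fourfold higher in the treated sample, geneB is unchanged, and geneC is roughly halved. A fold change alone says nothing about statistical significance, which is why replicates and the quality checkpoints the protocol emphasizes matter.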

Open source software packages used in this protocol

| Name | Hyperlink to the project home |
| --- | --- |
| UCSC utility scripts | http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/ |
| NCBI SRA Toolkit | https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=software |
| FastQC | http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ |
| Trimmomatic | http://www.usadellab.org/cms/?page=trimmomatic |
| STAR | https://github.com/alexdobin/STAR |
| SAMtools | http://samtools.sourceforge.net/ |
| RSeQC | http://rseqc.sourceforge.net/ |
| featureCounts | http://bioinf.wehi.edu.au/featureCounts/ |
| R | https://www.r-project.org/ |
| ezRun | https://github.com/uzh/ezRun |
| SUSHI | https://github.com/uzh/sushi |
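The tools listed above are typically chained from raw reads to a count table. The Python sketch below shows one plausible ordering for a single-end sample. It only builds and prints the command lines and does not run them. The step order, file names, thread count, and trimming settings are assumptions made for illustration and are not taken from the protocol; check each flag against the manual of the tool version you have installed.

```python
import shlex


def build_pipeline(sample, reads, genome_dir, gtf, bed, threads=4):
    """Return (step name, argv list) pairs for one single-end sample.

    All file names are hypothetical placeholders.
    """
    trimmed = f"{sample}.trimmed.fastq.gz"
    # STAR appends this suffix to the prefix given via --outFileNamePrefix.
    bam = f"{sample}_Aligned.sortedByCoord.out.bam"
    return [
        # Quality check of the raw reads; the output directory must already exist.
        ("read QC", ["fastqc", reads, "-o", "qc"]),
        # Adapter and quality trimming (settings follow the Trimmomatic manual example).
        ("trimming", ["java", "-jar", "trimmomatic.jar", "SE", "-phred33",
                      reads, trimmed,
                      "ILLUMINACLIP:TruSeq3-SE.fa:2:30:10",
                      "SLIDINGWINDOW:4:15", "MINLEN:36"]),
        # Spliced alignment to a pre-built STAR genome index.
        ("alignment", ["STAR", "--runThreadN", str(threads),
                       "--genomeDir", genome_dir,
                       "--readFilesIn", trimmed,
                       "--readFilesCommand", "zcat",
                       "--outSAMtype", "BAM", "SortedByCoordinate",
                       "--outFileNamePrefix", f"{sample}_"]),
        ("BAM indexing", ["samtools", "index", bam]),
        # Alignment-level checkpoint: infer the strandedness of the library.
        ("alignment QC", ["infer_experiment.py", "-i", bam, "-r", bed]),
        # Summarize aligned reads into gene-level counts.
        ("read counting", ["featureCounts", "-T", str(threads), "-a", gtf,
                           "-o", f"{sample}.counts.txt", bam]),
    ]


steps = build_pipeline("s1", "s1.fastq.gz", "star_index", "genes.gtf", "genes.bed")
for name, argv in steps:
    print(f"{name}: {shlex.join(argv)}")
```

Keeping each command as an argument list makes it easy to inspect or log each step before execution, and to pass it later to `subprocess.run(argv, check=True)` so that a failing step stops the run. The resulting count table would then be loaded into R for the differential expression step.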

 

Qi W, Schlapbach R, Rehrauer H. (2017) RNA-Seq Data Analysis: From Raw Data Quality Control to Differential Expression Analysis. Methods Mol Biol 1669:295-307.
