The transcriptome is the entire set of RNA transcripts in a given cell at a specific developmental stage or under a specific physiological condition. Understanding the transcriptome is necessary for interpreting the functional elements of the genome as well as for understanding the underlying mechanisms of development and disease. Microarray technologies have been used for high-throughput, large-scale RNA-level studies, for example to identify genes that are differentially expressed between developmental stages or between healthy and diseased groups. However, their hybridization-based nature limits the ability to catalog and quantify RNA molecules expressed under various conditions. Advances in massively parallel DNA sequencing technologies have enabled transcriptome sequencing (RNA-seq) by sequencing of cDNA. RNA-seq has rapidly replaced microarray technology because of its better resolution and higher reproducibility; this method can be used to extend our knowledge of alternative splicing events, novel genes and transcripts, and fusion transcripts.
In this review, the authors introduce a routine RNA-seq workflow together with related software, focusing particularly on transcriptome reconstruction and expression quantification.
Typical workflow for RNA sequencing (RNA-seq) data analysis.
This workflow shows an example of expression quantification and differential expression analysis at the gene and/or transcript level using RNA-seq. It typically consists of five steps: preprocessing, read alignment, transcriptome reconstruction, expression quantification, and differential expression analysis. QC, quality control.
| Workflow step | Category | Software |
|---|---|---|
| Preprocessing of Raw Data | Raw Data QC | FastQC |
| Read Alignment | Unspliced Aligner | MAQ |
| | RNA-Seq Specific Quality Control | RNA-SeQC |
| Transcriptome Reconstruction | Reference Guided | Cufflinks |
| Expression Quantification | Gene-level Quantification | ALEXA-Seq |
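
To make the expression quantification step more concrete, the following is a minimal sketch of how raw read counts per gene might be normalized for gene length and sequencing depth. It uses two widely used measures, RPKM (reads per kilobase per million mapped reads) and TPM (transcripts per million), which are not named in the text above and are chosen here only for illustration. The gene names, counts, and lengths are made-up toy values, and the functions `rpkm` and `tpm` are hypothetical helpers, not the implementation of any package listed in the table.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of gene length per million mapped reads.

    counts: dict mapping gene name -> number of reads assigned to the gene
    lengths_bp: dict mapping gene name -> gene length in base pairs
    """
    total_reads = sum(counts.values())
    if total_reads == 0:
        return {gene: 0.0 for gene in counts}
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return {
        gene: counts[gene] * 1e9 / (lengths_bp[gene] * total_reads)
        for gene in counts
    }


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to 1e6."""
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    scale = sum(rates.values())
    if scale == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: rate * 1e6 / scale for gene, rate in rates.items()}


# Toy input (assumed values, for illustration only).
counts = {"geneA": 500, "geneB": 1500, "geneC": 3000}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

for gene in counts:
    print(gene, round(rpkm_values[gene], 1), round(tpm_values[gene], 1))
```

In this toy input, geneB has three times as many reads as geneA but is also three times as long, so both measures assign the two genes the same expression level. TPM values additionally sum to one million within a sample, which makes the relative proportions easier to compare across samples.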