RNA-seq Data – Challenges in and Recommendations for Experimental Design and Analysis

RNA-seq is widely used to determine differential expression of genes or transcripts, as well as to identify novel transcripts, detect allele-specific expression, and precisely measure translation of transcripts. Thoughtful experimental design and choice of analysis tools are critical to ensure high-quality data and interpretable results. Important considerations for experimental design include the number of replicates, whether to collect paired-end or single-end reads, sequence length, and sequencing depth. Analysis steps common to all RNA-seq experiments include quality control, read alignment, assigning reads to genes or transcripts, and estimating gene or transcript abundance.
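
As a minimal sketch of the last of these steps, abundance estimation, the snippet below converts per-gene read counts into transcripts per million (TPM). TPM is one common normalization chosen here purely for illustration; the summary above does not name a specific abundance measure, and the function name, gene names, counts, and lengths are all hypothetical.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to TPM, given gene lengths in base pairs.

    Both arguments are dicts keyed by gene name (hypothetical inputs).
    """
    # Reads per kilobase: divide each count by the gene length in kb,
    # so that longer genes are not favored simply for being longer.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}

    # Scale so that the values across all genes sum to one million.
    total_rpk = sum(rpk.values())
    if total_rpk == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: value / total_rpk * 1e6 for gene, value in rpk.items()}


# Hypothetical toy data: three genes with made-up counts and lengths.
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

tpm = counts_to_tpm(counts, lengths_bp)
for gene, value in tpm.items():
    print(f"{gene}\t{value:.1f}")
```

In this toy data, geneA and geneB end up with the same TPM even though geneB has three times as many reads, because geneB is also three times as long. This is why length normalization matters when comparing abundance across genes.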

The authors make recommendations for common components of experimental design and assess tool capabilities for each of these steps. They also test tools designed to detect differential expression, since this is the most widespread application of RNA-seq. Their hope is that these analyses will help guide those who are new to RNA-seq and will generate discussion about remaining needs for tool improvement and development.

Williams AG, Thomas S, Wyman SK, Holloway AK. (2014) RNA-seq Data: Challenges in and Recommendations for Experimental Design and Analysis. Curr Protoc Hum Genet [Epub ahead of print].