Researchers at Uppsala University, Sweden, have used a comprehensive simulation approach to explore how features of the transcriptome (complexity, degree of polymorphism π, alternative splicing), technological processing (sequencing error ε, library normalization) and the bioinformatic workflow (de novo vs. mapping assembly, reference genome quality) affect transcriptome quality and the inference of differential gene expression (DE). Their main conclusions:
- Transcriptome assembly and gene expression profiling (edgeR vs. baySeq software) work well even in the absence of a reference genome and are robust across a broad range of parameters.
- The authors advise against library normalization and, in most situations, advocate mapping assemblies to an annotated genome of a divergent sister clade, which generally outperformed de novo assembly (Trans-ABySS, Trinity, SOAPdenovo-Trans).
- Transcriptome complexity (size, paralogs, alternative splicing isoforms) negatively affected both assembly and DE profiling, whereas the effects of sequencing error and polymorphism were almost negligible.
- Both the mapping strategy and the quality of the reference genome matter considerably.
Transcriptome Shotgun Sequencing (RNA-seq) has been readily embraced by geneticists and molecular ecologists alike. As with all high-throughput technologies, it is critical to understand which analytic strategies are best suited and which parameters may bias the interpretation of the data.
For more on the effects of methods on differential expression results, see http://www.rna-seqblog.com/publications/effects-of-the-method-on-estimation-of-dge/
- Vijay N, Poelstra JW, Künstner A, Wolf JB. (2012) Challenges and strategies in transcriptome assembly and differential gene expression quantification. A comprehensive in silico assessment of RNA-seq experiments. Mol Ecol [Epub ahead of print].