Polyester: simulating RNA-seq datasets with differential transcript expression

Developing statistical methods for differential expression analysis of RNA sequencing (RNA-seq) data requires software tools to assess accuracy and error rate control. Because true differential expression status is often unknown in experimental datasets, artificially constructed datasets must be used, obtained either from costly spike-in experiments or by simulating RNA-seq data. Polyester is an R package designed to simulate RNA-seq data, beginning with an experimental design and ending with collections of RNA-seq reads. The main advantage of Polyester is its ability to simulate isoform-level differential expression across biological replicates, at the read level, for a variety of experimental designs. Differential expression signal can be simulated with either built-in or user-defined statistical models.
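
To make the idea of simulating a differential expression signal concrete, the following is a minimal sketch in Python. It is not Polyester's code or API. It assumes a negative binomial count model (a common choice for RNA-seq counts; the abstract does not say which models are built in) and a two-group design. The names `simulate_counts`, `baseline_means`, `fold_changes`, and `size` are hypothetical and chosen only for illustration. The sketch stops at per-transcript read counts and does not generate read sequences.

```python
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) count by counting unit-rate arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def negative_binomial(mean, size, rng):
    """Draw a count with the given mean and variance mean + mean**2 / size.

    Uses the gamma-Poisson mixture: a gamma-distributed rate, then a Poisson draw.
    """
    if mean <= 0:
        return 0
    rate = rng.gammavariate(size, mean / size)  # shape=size, scale=mean/size
    return poisson(rate, rng)


def simulate_counts(baseline_means, fold_changes, reps_per_group, size=10.0, seed=0):
    """Simulate read counts per transcript for two groups of replicates.

    baseline_means: dict mapping transcript id -> mean count in group 1
    fold_changes:   dict mapping transcript id -> group 2 / group 1 ratio
                    (transcripts not listed are not differentially expressed)
    Returns a dict mapping transcript id -> [group 1 counts, group 2 counts].
    """
    rng = random.Random(seed)
    counts = {}
    for transcript, base in baseline_means.items():
        fold_change = fold_changes.get(transcript, 1.0)
        group_means = (base, base * fold_change)
        counts[transcript] = [
            [negative_binomial(m, size, rng) for _ in range(reps_per_group)]
            for m in group_means
        ]
    return counts


if __name__ == "__main__":
    # Two isoforms of a hypothetical gene; only the second is differentially expressed.
    baseline = {"isoform_A": 50.0, "isoform_B": 50.0}
    signal = {"isoform_B": 4.0}
    for transcript, groups in simulate_counts(baseline, signal, reps_per_group=3).items():
        print(transcript, groups)
```

Because the true fold changes are set by the caller, the output of a differential expression method run on such data can be compared directly against the known truth, which is the purpose of simulation described above.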


Availability – Polyester is available on GitHub at https://github.com/alyssafrazee/polyester.

Frazee AC, Jaffe AE, Langmead B, Leek J. (2014) Polyester: simulating RNA-seq datasets with differential transcript expression. bioRxiv [Epub ahead of print].