Flexible expressed region analysis for RNA-seq with derfinder

Differential expression analysis of RNA sequencing (RNA-seq) data typically relies on reconstructing transcripts or counting reads that overlap known gene structures. Researchers at Johns Hopkins University previously introduced an intermediate statistical approach called differentially expressed region (DER) finder that seeks to identify contiguous regions of the genome showing differential expression signal at single base resolution without relying on existing annotation or potentially inaccurate transcript assembly.

The researchers present the derfinder software that improves their annotation-agnostic approach to RNA-seq analysis by:

  1. implementing a computationally efficient bump-hunting approach to identify DERs which permits genome-scale analyses in a large number of samples,
  2. introducing a flexible statistical modeling framework, including multi-group and time-course analyses, and
  3. introducing a new set of data visualizations for expressed region analysis.

They apply this approach to public RNA-seq data from the Genotype-Tissue Expression (GTEx) project and the BrainSpan project to show that derfinder permits the analysis of hundreds of samples at base resolution in R, identifies expression outside of known gene boundaries, and can be used to visualize expressed regions at base resolution. In simulations, their base-resolution approaches enable discovery in the presence of incomplete annotation and are nearly as powerful as feature-level methods when the annotation is complete.

Finding regions via the expressed region-level approach on chromosome 5 with the BrainSpan data set


A Mean coverage with segments passing the mean cutoff (0.25) marked as regions. B Raw coverage curves superimposed with the candidate regions. Coverage curves are colored by brain region and developmental stage (NCX: neocortex, Non-NCX: non-neocortex, CBC: cerebellum, F: fetal, P: postnatal). C Known exons (dark blue) and introns (light blue) by strand for genes and subsequent transcripts in the locus. The DERs best support the GABRA6 transcript marked with a red star, indicating the presence of a differentially expressed transcript.
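
The region-finding step in panel A can be illustrated with a small sketch. This is not the derfinder implementation (derfinder is an R package that operates on genome-scale coverage data); it is a minimal Python illustration, assuming coverage is available as one list of per-base values per sample. The function names `mean_coverage` and `expressed_regions` are hypothetical, and the toy data are invented for the example. Only the cutoff of 0.25 comes from the figure.

```python
def mean_coverage(coverage):
    """Average per-base coverage across samples.

    coverage: list of equal-length lists, one per sample (hypothetical layout).
    """
    n_samples = len(coverage)
    return [sum(column) / n_samples for column in zip(*coverage)]


def expressed_regions(coverage, cutoff=0.25):
    """Return half-open (start, end) intervals of contiguous bases whose
    mean coverage is strictly above the cutoff."""
    regions = []
    start = None
    for pos, value in enumerate(mean_coverage(coverage)):
        if value > cutoff:
            if start is None:
                start = pos  # a new candidate region opens here
        elif start is not None:
            regions.append((start, pos))  # region closes before this base
            start = None
    if start is not None:
        # the last region runs to the end of the sequence
        regions.append((start, len(coverage[0])))
    return regions


# Toy coverage for three samples over ten bases (invented values).
toy_coverage = [
    [0, 0, 1, 2, 1, 0, 0, 0, 3, 0],
    [0, 0, 1, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 2, 0],
]
regions = expressed_regions(toy_coverage, cutoff=0.25)
print(regions)
```

The sketch covers only the thresholding of mean coverage into candidate regions. Testing those regions for differential expression, as the DERs in the figure require, is a separate step not shown here.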

Availability – The package is available from Bioconductor at www.bioconductor.org/packages/derfinder

Collado-Torres L, Nellore A, Frazee AC, Wilks C, Love MI, Irizarry RA, Leek J, Jaffe AE. (2016) Flexible expressed region analysis for RNA-seq with derfinder. bioRxiv [Epub ahead of print].

