DE-kupl: exhaustive capture of biological variation in RNA-seq data through k-mer decomposition

University of Montpellier researchers have developed a k-mer-based computational protocol, DE-kupl, for capturing local RNA variation in a set of RNA-seq libraries, independently of a reference genome or transcriptome. DE-kupl extracts all k-mers with differential abundance directly from the raw data files. This enables the retrieval of virtually all variation present in an RNA-seq data set. This variation is subsequently assigned to biological events or entities such as differential long non-coding RNAs, splice and polyadenylation variants, introns, repeats, editing or mutation events, and exogenous RNA. Applying DE-kupl to human RNA-seq data sets identified multiple types of novel events, reproducibly across independent RNA-seq experiments.
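To make the k-mer decomposition idea concrete, here is a minimal toy sketch in Python. It is not DE-kupl's implementation (the pipeline itself relies on Jellyfish for counting); the function names `count_kmers` and `build_count_matrix` and the `min_recurrence` parameter are assumptions made for illustration, and the sketch ignores strand and sequencing errors.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every substring of length k across a list of read sequences."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def build_count_matrix(libraries, k, min_recurrence=2):
    """Join per-library k-mer counts into {kmer: [count in each library]}.

    Only k-mers observed in at least `min_recurrence` libraries are kept
    (a hypothetical stand-in for a low-recurrence filter).
    """
    per_library = [count_kmers(reads, k) for reads in libraries]
    all_kmers = set().union(*per_library)
    matrix = {}
    for kmer in sorted(all_kmers):
        row = [counts[kmer] for counts in per_library]
        if sum(1 for c in row if c > 0) >= min_recurrence:
            matrix[kmer] = row
    return matrix


# Three toy "libraries", each holding a single short read.
libraries = [["ACGTA"], ["ACGTT"], ["CGTAA"]]
matrix = build_count_matrix(libraries, k=4)
```

Each row of the resulting matrix is one k-mer and each column one library, which is the shape a differential abundance test would then operate on.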

The DE-kupl pipeline for the discovery and analysis of differentially expressed k-mers


First, Jellyfish is applied to count k-mers in all libraries. The k-mer counts are then joined into a count matrix and filtered to remove k-mers with low recurrence and k-mers matching the reference transcriptome. Normalization factors are computed from the raw k-mer counts, and the differential expression procedure is applied. Finally, overlapping differentially expressed k-mers are extended into contigs and annotated based on their alignment to the reference and their overlap with annotated genes.
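The final step, extending overlapping k-mers into contigs, can be illustrated with a simplified sketch. This is not DE-kupl's own assembly code: the function `extend_kmers` is hypothetical, it merges k-mers only when they overlap by k-1 bases and the extension is unambiguous in both directions, and it ignores reverse complements.

```python
def extend_kmers(kmers):
    """Merge k-mers that overlap by k-1 bases into contigs.

    Two k-mers are joined only when the link is unambiguous: the first
    has exactly one possible successor and the second exactly one
    possible predecessor. Branch points end the contig.
    """
    kmers = set(kmers)
    if not kmers:
        return []
    k = len(next(iter(kmers)))

    # Index k-mers by their (k-1)-length prefix and suffix.
    by_prefix, by_suffix = {}, {}
    for km in kmers:
        by_prefix.setdefault(km[:k - 1], []).append(km)
        by_suffix.setdefault(km[1:], []).append(km)

    def unique_successor(km):
        nxt = by_prefix.get(km[1:], [])
        if len(nxt) == 1 and len(by_suffix[km[1:]]) == 1:
            return nxt[0]
        return None

    def unique_predecessor(km):
        prev = by_suffix.get(km[:k - 1], [])
        if len(prev) == 1 and len(by_prefix[km[:k - 1]]) == 1:
            return prev[0]
        return None

    visited = set()

    def walk(start):
        # Follow unambiguous successors, appending one base per step.
        contig = start
        visited.add(start)
        nxt = unique_successor(start)
        while nxt is not None and nxt not in visited:
            contig += nxt[-1]
            visited.add(nxt)
            nxt = unique_successor(nxt)
        return contig

    contigs = []
    # Start from k-mers that cannot be extended to the left.
    for km in sorted(kmers):
        if unique_predecessor(km) is None:
            contigs.append(walk(km))
    # Anything still unvisited lies on a cycle; start anywhere on it.
    for km in sorted(kmers):
        if km not in visited:
            contigs.append(walk(km))
    return contigs


contigs = extend_kmers(["ACGT", "CGTA", "GTAC", "GGGC"])
# contigs == ["ACGTAC", "GGGC"]
```

Stopping at branch points keeps each contig tied to a single local event, which is what makes the later alignment and annotation step interpretable.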

Audoux J, Philippe N, Chikhi R, Salson M, Gallopin M, Gabriel M, Le Coz J, Drouineau E, Commes T, Gautheret D. (2017) DE-kupl: exhaustive capture of biological variation in RNA-seq data through k-mer decomposition. Genome Biol 18(1):243. [article]
