Recent studies highlight the crucial roles of transposable elements (TEs) in regulating the expression of proximal genes in distinct biological contexts such as disease and development. However, computational tools that extract potential associations between TE expression and proximal gene expression from RNA-sequencing data are still lacking.
Researchers from the Izmir Biomedicine and Genome Center have developed a novel R package, based on a linear regression model, for studying the potential influence of TE species on proximal gene expression in a given RNA-sequencing data set. The package, named TEffectR, makes use of publicly available RepeatMasker TE and Ensembl gene annotations as well as several functions from other R packages. It calculates total TE read counts from sorted and indexed genome-aligned BAM files provided by the user, and identifies statistically significant associations between TE expression and the transcription of nearby genes under diverse biological conditions.
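To make the regression idea concrete, the following is a minimal Python sketch of a single-predictor ordinary least squares fit relating the expression of one TE to the expression of one nearby gene. It is an illustration only, not TEffectR's code: the package itself is written in R, and its actual model may include further predictors or a different normalization. The function name `fit_simple_ols`, the log2(count + 1) transform, and all count values are assumptions made for this example.

```python
from math import log2


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared). Hypothetical helper for
    illustration; it is not a function of the TEffectR package.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("predictor and response must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up read counts for one TE and one nearby gene across five samples.
te_counts = [10, 20, 40, 80, 160]
gene_counts = [105, 190, 410, 790, 1620]

# Assumed transform: log2(count + 1) to stabilise the variance.
te_expr = [log2(c + 1) for c in te_counts]
gene_expr = [log2(c + 1) for c in gene_counts]

slope, intercept, r_squared = fit_simple_ols(te_expr, gene_expr)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.3f}")
```

A positive slope with a high R² would suggest that the gene's expression rises with that of the TE in these samples. Judging statistical significance would additionally require a test on the slope, which this sketch omits.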
The workflow of the TEffectR package
The package contains six core functions for identifying potential links between TEs and nearby genes at a genome-wide scale. TEffectR requires two inputs from the user: (i) a raw gene count matrix and (ii) genomic alignments of sequencing reads in BAM file format.
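As a rough illustration of what linking TEs to "nearby" genes can mean, the Python sketch below pairs each gene with the TEs that lie within a fixed distance on the same chromosome. It is a simplification under stated assumptions and does not reproduce any of TEffectR's six functions: the names `interval_gap` and `pair_tes_with_genes`, the 1,000-base window, and all coordinates are hypothetical, and strand is ignored.

```python
def interval_gap(start_a, end_a, start_b, end_b):
    """Return the gap in bases between two intervals (0 if they overlap)."""
    return max(0, max(start_a, start_b) - min(end_a, end_b))


def pair_tes_with_genes(genes, tes, max_distance):
    """List (gene_id, te_id, gap) for TEs within max_distance of a gene.

    genes and tes are lists of (id, chromosome, start, end) tuples.
    Hypothetical helper for illustration; not part of TEffectR.
    """
    pairs = []
    for gene_id, gene_chrom, gene_start, gene_end in genes:
        for te_id, te_chrom, te_start, te_end in tes:
            if gene_chrom != te_chrom:
                continue  # only compare features on the same chromosome
            gap = interval_gap(gene_start, gene_end, te_start, te_end)
            if gap <= max_distance:
                pairs.append((gene_id, te_id, gap))
    return pairs


# Made-up annotations: (id, chromosome, start, end).
genes = [
    ("geneA", "chr1", 10000, 12000),
    ("geneB", "chr2", 5000, 9000),
]
tes = [
    ("TE1", "chr1", 9000, 9500),    # 500 bases from geneA
    ("TE2", "chr1", 20000, 20300),  # 8,000 bases from geneA
    ("TE3", "chr2", 8800, 9200),    # overlaps geneB
]

pairs = pair_tes_with_genes(genes, tes, max_distance=1000)
for gene_id, te_id, gap in pairs:
    print(gene_id, te_id, gap)
```

The nested loop is quadratic in the number of features and is only suitable for small examples. Genome-wide annotations would call for sorted intervals or an interval tree.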
Availability – TEffectR is freely available at https://github.com/karakulahg/TEffectR