Calling genotypes from public RNA-sequencing data enables identification of genetic variants that affect gene-expression levels

RNA-sequencing (RNA-seq) is a powerful technique for identifying genetic variants that affect gene-expression levels, either through expression quantitative trait locus (eQTL) mapping or through allele-specific expression (ASE) analysis. Given the growing number of RNA-seq samples in the public domain, researchers from the University of Groningen investigated to what extent eQTLs and ASE effects can be identified using public RNA-seq data when the genotypes are derived from the RNA-seq reads themselves.

The researchers downloaded the raw reads for all available human RNA-seq datasets and used them to quantify gene expression. All samples were jointly normalized and subjected to strict quality control. The researchers also derived genotypes from the RNA-seq reads and used imputation to infer non-coding variants. This allowed them to perform eQTL mapping and ASE analyses jointly on all samples that passed quality control. The results were validated using samples for which DNA-seq genotypes were available.
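To make the two analyses concrete, here is a minimal sketch of what an eQTL test and an ASE test compute for a single variant. It is illustrative only and not the pipeline used in the study. The function names `eqtl_slope` and `ase_binomial_p`, the toy numbers, and the choice of a least-squares slope and an exact binomial test against a 50/50 allelic ratio are all assumptions made for this example.

```python
from math import comb


def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on allele dosage (0, 1 or 2).

    A non-zero slope suggests the variant is associated with expression.
    Hypothetical helper: a real analysis would also compute a p-value
    and correct for covariates and multiple testing.
    """
    n = len(dosages)
    if n != len(expression) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 samples")
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    var_g = sum((g - mean_g) ** 2 for g in dosages)
    if var_g == 0:
        raise ValueError("all samples carry the same genotype")
    cov = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    return cov / var_g


def ase_binomial_p(ref_reads, alt_reads):
    """Two-sided exact binomial p-value for allelic imbalance.

    Null hypothesis (an assumption of this sketch): in a heterozygous
    sample, reads come from either allele with probability 0.5.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this variant")
    observed = comb(n, ref_reads)
    # Under p = 0.5 every outcome i has probability comb(n, i) / 2**n,
    # so sum the outcomes that are no more likely than the observed one.
    tail = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= observed)
    return tail / 2 ** n


if __name__ == "__main__":
    # Toy data: allele dosage per sample and expression of one gene.
    dosages = [0, 0, 1, 1, 2, 2]
    expression = [4.9, 5.1, 6.0, 6.2, 7.1, 6.9]
    print("eQTL slope:", eqtl_slope(dosages, expression))

    # Toy read counts at one heterozygous site in one sample.
    print("ASE p-value:", ase_binomial_p(ref_reads=18, alt_reads=4))
```

The eQTL test compares expression across samples with different genotypes, whereas the ASE test compares the two alleles within a single heterozygous sample. This is why both analyses can, in principle, be run on genotypes called from the RNA-seq reads themselves.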

In total, 4,978 public human RNA-seq runs, representing many different tissues and cell types, passed quality control.

Deelen P, Zhernakova DV, de Haan M, van der Sijde M, Bonder MJ, Karjalainen J, van der Velde KJ, Abbott KM, Fu J, Wijmenga C, Sinke RJ, Swertz MA, Franke L. (2015) Calling genotypes from public RNA-sequencing data enables identification of genetic variants that affect gene-expression levels. Genome Med 7(1):30.
