Predicting causal variants affecting expression using whole genome sequence and RNA-seq from multiple human tissues

Genetic association mapping produces statistical links between phenotypes and genomic regions, but identifying the causal variants themselves remains difficult. Complete knowledge of all genetic variants, as provided by whole genome sequence (WGS), will help, but WGS is currently too expensive for well-powered GWAS.

To explore the advantages of WGS in a well-powered setting, researchers from the University of Geneva Medical School performed eQTL mapping using WGS and RNA-seq, and showed that the lead eQTL variants called using WGS are more likely to be causal. They derived properties of the causal variant from simulation studies and used these to propose a method, CaVEMaN, for implicating likely causal SNPs. This method predicts that 25% to 70% of the causal variants lie in open chromatin regions, depending on tissue and experiment. Finally, the researchers identified a set of high-confidence causal variants and showed that they are more enriched in GWAS associations than other eQTLs. Among these variants, they found 65 associations with GWAS traits and presented examples where the gene implicated by expression has been functionally validated as relevant for complex traits.
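To make the idea of a "lead eQTL variant" concrete, here is a minimal Python sketch. It is not the CaVEMaN method and not the authors' code. It simulates genotypes for one gene, makes one variant causal, and picks as lead the variant whose genotype correlates most strongly with expression. The function names, sample size, allele frequency, and effect size are all illustrative assumptions. The simulated variants are also independent, so the lead variant coincides with the causal one; in real data, linkage disequilibrium between nearby variants is what makes the two differ.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def lead_variant(genotypes, expression):
    """Return (index, r) for the variant most strongly correlated with expression.

    genotypes: one list per variant, holding allele dosages (0, 1 or 2).
    """
    scores = [pearson(g, expression) for g in genotypes]
    best = max(range(len(scores)), key=lambda i: abs(scores[i]))
    return best, scores[best]


def simulate(n_individuals=500, n_variants=20, causal=7, effect=1.0, seed=0):
    """Simulate independent variants (assumed allele frequency 0.3) and one gene.

    Only the variant at index `causal` influences expression; the rest are noise.
    """
    rng = random.Random(seed)
    genotypes = [
        # Dosage = number of alternate alleles across two chromosomes.
        [(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n_individuals)]
        for _ in range(n_variants)
    ]
    expression = [
        effect * genotypes[causal][i] + rng.gauss(0, 0.5)
        for i in range(n_individuals)
    ]
    return genotypes, expression


genotypes, expression = simulate()
lead_index, lead_r = lead_variant(genotypes, expression)
print("lead variant index:", lead_index, "correlation:", round(lead_r, 2))
```

With complete genotype data such as WGS, the causal variant is itself among the tested variants and so can be selected as the lead. If it is missing from the data, the lead can only be a correlated proxy.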


Distribution of the CaVEMaN estimated causal probabilities for all lead eQTLs, broken down by tissue.

Availability – Code for correcting the expression datasets for multiple eQTLs, running the CaVEMaN method and converting the CaVEMaN score to a causal probability can be found here: https://github.com/funpopgen/CaVEMaN.

Brown AA, Viñuela A, Delaneau O, Spector T, Small K, Dermitzakis E. (2016) Predicting causal variants affecting expression using whole genome sequence and RNA-seq from multiple human tissues. bioRxiv [Epub ahead of print].
