Studies that attempt to functionally interpret complex-disease susceptibility loci by integrating GWAS and eQTL data have predominantly used microarrays to quantify gene expression. RNA-Seq has the potential to discover a more comprehensive set of eQTLs and to illuminate their underlying molecular consequences.
Researchers from King’s College London examined the functional outcome of 39 variants associated with Systemic Lupus Erythematosus (SLE) by integrating GWAS and eQTL data from the TwinsUK microarray and RNA-Seq cohort in lymphoblastoid cell lines. They used conditional analysis and a Bayesian colocalisation method to provide evidence of a shared causal variant, then compared the ability of each quantification type to detect disease-relevant eQTLs and eGenes. They discovered a greater frequency of candidate-causal eQTLs using RNA-Seq, and identified novel SLE susceptibility genes that were not detected using microarrays (e.g. NADSYN1, SKP1, and TCF7). Many of these eQTLs were found to influence the expression of several genes, suggesting that risk haplotypes may harbour multiple functional effects. The researchers pinpointed eQTLs modulating the expression of four non-coding RNAs, three of which were replicated in whole blood. Novel SLE-associated splicing events were identified in the T-reg-restricted transcription factor IKZF2, the autophagy-related gene WDFY4, and the redox coenzyme NADSYN1, through asQTL mapping using the Geuvadis cohort. By maximising the leverage of RNA-Seq and performing integrative GWAS-eQTL analysis against gene, exon, and splice-junction quantifications, the study advances understanding of the genetic control of gene expression in SLE. In doing so, the researchers identified novel SLE candidate genes and specific molecular mechanisms that will serve as the basis for targeted follow-up studies.
Two-stage cis-eQTL annotation pipeline for SLE susceptibility loci
SLE susceptibility variants (Table 1) were annotated using residualized expression or summary-level eQTL statistics from three expression datasets: microarray probe-level expression data, and both gene-level and exon-level RNA-Seq quantifications. Each expression dataset was generated from lymphoblastoid cell lines (LCLs) from individuals of the TwinsUK cohort. A) We undertook cis-eQTL analysis of ±1 Mb intervals around each SNP, and associations with q < 0.05 after FDR adjustment were taken forward. B) Summary-level data from significant cis-eQTLs were tested for evidence of a shared causal variant, first by conditional analysis using the TwinsUK genetic data as a reference panel, then by colocalisation analysis to test for a single causal variant common to both traits. Associations passing these thresholds (described fully in methods) were classified as candidate-causal eQTLs and eGenes. Summary results per quantification type for significant and candidate-causal associations are shown in Table 3 for microarray, Table 4 for RNA-Seq (gene-level), and Table 5 for RNA-Seq (exon-level).
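To make stage A of the pipeline concrete, here is a minimal Python sketch of the two filters it describes: selecting expression features within ±1 Mb of a SNP, and keeping associations with q < 0.05. It is an illustration only, not the authors' code. The `Feature` record, the use of a single coordinate per feature, and the choice of the Benjamini-Hochberg procedure for the FDR adjustment are all assumptions, since the caption does not say which FDR method or feature coordinates were used.

```python
from dataclasses import dataclass

WINDOW_BP = 1_000_000  # +/-1 Mb cis window, as stated in the caption
Q_THRESHOLD = 0.05     # FDR threshold, as stated in the caption


@dataclass
class Feature:
    """Hypothetical expression feature (probe, gene, or exon)."""
    name: str
    chrom: str
    pos: int  # assumed single reference coordinate, e.g. a start site


def features_in_cis(snp_chrom, snp_pos, features, window=WINDOW_BP):
    """Return features on the same chromosome within `window` bp of the SNP."""
    return [
        f for f in features
        if f.chrom == snp_chrom and abs(f.pos - snp_pos) <= window
    ]


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method)."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def significant(names, pvalues, threshold=Q_THRESHOLD):
    """Names of associations whose adjusted q-value is below the threshold."""
    return [n for n, q in zip(names, bh_qvalues(pvalues)) if q < threshold]
```

Under these assumptions, a SNP would first be passed to `features_in_cis` to collect the features to test, and the resulting association p-values would then go through `significant` to decide which cis-eQTLs proceed to the conditional and colocalisation analyses in stage B.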