When cellular traits are measured using high-throughput DNA sequencing, quantitative trait loci (QTLs) manifest as fragment count differences between individuals and allelic differences within individuals. Researchers from the Wellcome Trust Sanger Institute have developed RASQUAL (Robust Allele-Specific Quantitation and Quality Control), a new statistical approach for association mapping that models genetic effects and accounts for biases in sequencing data using a single probabilistic framework. RASQUAL substantially improves fine-mapping accuracy and sensitivity relative to existing methods in RNA-seq, DNase-seq and ChIP-seq data. The authors illustrate how RASQUAL can be used to maximize association detection by generating the first map of chromatin accessibility QTLs (caQTLs) in a European population using ATAC-seq. Despite a modest sample size, they identified 2,707 independent caQTLs (at a false discovery rate of 10%) and demonstrated how RASQUAL and ATAC-seq can provide powerful information for fine-mapping gene-regulatory variants and for linking distal regulatory elements with gene promoters. These results highlight how combining between-individual and allele-specific genetic signals improves the functional interpretation of noncoding variation.
Throughout, reference (ref.) and alternative (alt.) alleles are colored blue and red and coded 0 and 1, respectively, while reference and alternative haplotypes are colored orange and green, respectively. (a) The plot illustrates the two sources of input data to RASQUAL: between-individual and allele-specific (AS) signals, as observed from sequence data. The left panel shows that the fragment count (FC) is proportional to the genotype at the regulatory SNP (rSNP), and the right panel illustrates how the two signals are connected by the cis-regulatory effect π after conversion of allele-specific counts into haplotype-specific expression. (b) Visual representation of the key RASQUAL features and parameters. Overdispersion introduces greater heterogeneity in the allele-specific count than would be expected under a binomial assumption. RASQUAL models the overdispersion in allele-specific counts and total fragment counts with a single parameter, θ. Genotyping error introduces complete allelic imbalance (AI) when a homozygote is miscalled as a heterozygote. Haplotype switching produces inconsistent allelic imbalance among the SNPs within an individual. Reference bias occurs when sequenced reads containing the alternative allele(s) cannot be mapped to the correct location. RASQUAL employs a parameter ϕ that captures the excess of allelic imbalance beyond the genetic effect π. Sequencing/mapping error introduces additional allelic imbalance or genotype inconsistency. RASQUAL explicitly models the proportion of reads that are erroneously sequenced or mapped to incorrect genomic locations with a parameter δ, which allows for imperfect sequencing results. Imprinting introduces strong allelic imbalance that can confound the detection of genetic effects.
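To make the roles of π, ϕ, δ and θ in the caption more concrete, the following is a minimal Python sketch of a toy model, not RASQUAL's actual likelihood. It assumes that the mean fragment count is a sum of per-haplotype contributions weighted by π, that reference bias ϕ and mapping error δ shift the alternative-read probability in a heterozygote in a simple way, and that overdispersion can be imitated by a beta-binomial with precision θ. All function names, the `scale` argument and these exact parametrizations are illustrative assumptions.

```python
import random
import statistics


def expected_total_count(genotype, pi, scale=100.0):
    """Toy mean fragment count for an rSNP genotype (0, 1 or 2 alt alleles).

    Assumption: each ref haplotype contributes (1 - pi) and each alt
    haplotype contributes pi, so pi = 0.5 means no genetic effect.
    """
    n_alt = genotype
    n_ref = 2 - genotype
    return scale * (n_ref * (1.0 - pi) + n_alt * pi)


def alt_read_prob(pi, phi=0.5, delta=0.0):
    """Toy probability that a read at a heterozygous SNP carries the alt allele.

    Assumptions: phi reweights the alleles (phi = 0.5 means no reference
    bias, phi < 0.5 favours the reference), and a fraction delta of reads
    is assigned to the wrong allele.
    """
    biased = pi * phi / (pi * phi + (1.0 - pi) * (1.0 - phi))
    return (1.0 - delta) * biased + delta * (1.0 - biased)


def sample_alt_count(n_reads, p_alt, theta, rng):
    """Draw an alt-allele read count; theta=None gives a plain binomial.

    Assumption: overdispersion is imitated by drawing the success
    probability from Beta(p_alt * theta, (1 - p_alt) * theta), so a
    smaller theta means more overdispersion in this sketch.
    """
    if theta is None:
        p = p_alt
    else:
        p = rng.betavariate(p_alt * theta, (1.0 - p_alt) * theta)
    return sum(1 for _ in range(n_reads) if rng.random() < p)


def simulate_variances(n_reads=50, p_alt=0.5, theta=5.0, n_draws=2000, seed=1):
    """Return (binomial variance, beta-binomial variance) of simulated counts."""
    rng = random.Random(seed)
    binom = [sample_alt_count(n_reads, p_alt, None, rng) for _ in range(n_draws)]
    betabin = [sample_alt_count(n_reads, p_alt, theta, rng) for _ in range(n_draws)]
    return statistics.pvariance(binom), statistics.pvariance(betabin)


if __name__ == "__main__":
    for g in (0, 1, 2):
        print("genotype", g, "mean count", expected_total_count(g, pi=0.7))
    print("alt read probability:", alt_read_prob(pi=0.7, phi=0.45, delta=0.01))
    print("variances (binomial, beta-binomial):", simulate_variances())
```

In this sketch the same π drives both the between-individual trend in `expected_total_count` and the within-individual imbalance in `alt_read_prob`, which is the connection panel (a) describes. The beta-binomial draw shows why a binomial assumption understates the spread of allele-specific counts. The direction of θ here (smaller means noisier) is a property of this toy parametrization and may not match RASQUAL's own definition.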
Availability – RASQUAL software and documentation can be accessed at: https://github.com/dg13/rasqual