Improve the statistical power of your cis-eQTL mapping for RNA-Seq data

Studies of expression quantitative trait loci (eQTLs) offer insight into the molecular mechanisms of loci found to be associated with complex diseases. These mechanisms can be classified into cis- and trans-acting regulation. At present, high-throughput RNA sequencing (RNA-seq) is rapidly replacing expression microarrays as the means of assessing gene expression abundance. Unlike microarrays, which measure only the total expression of each gene, RNA-seq also provides information on allele-specific expression (ASE), which can be used to distinguish cis-eQTLs from trans-eQTLs and, more importantly, to enhance cis-eQTL mapping. However, assessing the cis-effect of a candidate eQTL on a gene requires knowledge of the haplotypes connecting the candidate eQTL and the gene, which cannot be inferred with certainty. The existing two-stage approach first phases the candidate eQTL against the gene and then treats the inferred phase as observed in the association analysis. This tends to attenuate the estimated cis-effect and reduce the power to detect a cis-eQTL.
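
The attenuation caused by treating an uncertain phase as observed can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions and is not the authors' model: the function name, the Gaussian noise on a log expression ratio, and the fixed phase-error rate are all hypothetical choices made for illustration. When the inferred phase is wrong with probability e, the average estimated effect shrinks by roughly a factor of (1 - 2e).

```python
import random

def simulate_ase_effect(n=20000, effect=0.5, phase_error=0.2, noise_sd=0.3, seed=1):
    """Toy illustration (hypothetical setup, not the published model).

    Returns (estimate using the true phase, estimate using an error-prone phase).
    """
    rng = random.Random(seed)
    true_sum = 0.0
    inferred_sum = 0.0
    for _ in range(n):
        # True phase: +1 if the eQTL's alternate allele sits on haplotype 1, else -1.
        phase = rng.choice((1, -1))
        # Log ratio of haplotype-1 to haplotype-2 expression, plus Gaussian noise.
        log_ratio = phase * effect + rng.gauss(0.0, noise_sd)
        # The inferred phase is flipped with probability phase_error.
        inferred = -phase if rng.random() < phase_error else phase
        true_sum += phase * log_ratio
        inferred_sum += inferred * log_ratio
    return true_sum / n, inferred_sum / n

true_est, two_stage_est = simulate_ase_effect()
print(f"true phase: {true_est:.3f}, error-prone phase: {two_stage_est:.3f}")
```

With a 20% phase-error rate and a true effect of 0.5, the second estimate should land near 0.3 rather than 0.5, which is the kind of bias toward zero that the two-stage approach is said to incur.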

In this article, a team led by researchers at Emory University provides a maximum-likelihood framework for cis-eQTL mapping with RNA-seq data. Their approach integrates the inference of haplotypes and the association analysis into a single stage, and is thus unbiased and statistically powerful. The team also developed a pipeline that performs a comprehensive scan of all local eQTLs for all genes in the genome while controlling the false discovery rate, and implemented the methods in a computationally efficient software program. The advantages of the proposed methods over the existing ones are demonstrated through realistic simulation studies and an application to empirical breast cancer data from The Cancer Genome Atlas project.
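
For readers unfamiliar with false discovery rate control, the sketch below shows the Benjamini-Hochberg step-up procedure, one common way to control the FDR across many tests. The summary above does not say which FDR procedure the authors' pipeline uses, so this is an assumption chosen for illustration, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Benjamini-Hochberg step-up procedure; an illustrative choice, not
    necessarily the procedure used in the published pipeline.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis whose rank is at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

# Hypothetical p-values for four gene-SNP pairs.
print(benjamini_hochberg([0.039, 0.001, 0.2, 0.008]))
```

Because the procedure is step-up, a p-value that misses its own threshold is still rejected if a larger p-value further down the sorted list meets its threshold.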


Power of the ASE model for testing a cis-eQTL in the first simulation study. The nominal significance level is 0.05. MLE, IMP, and TRUE are the maximum-likelihood method, the two-stage method, and the method using the true phase, respectively. Each power estimate is based on 10,000 replicates.

Availability – TRECASE_MLE is available at: http://web1.sph.emory.edu/users/yhu30/software.html

Hu YJ, Sun W, Tzeng JY, Perou CM. (2015) Proper Use of Allele-Specific Expression Improves Statistical Power for cis-eQTL Mapping with RNA-Seq Data. J Am Stat Assoc 110(511):962-974.
