RNA sequencing (RNA-seq) is a powerful approach for measuring gene expression levels in cells and tissues, but it relies on high-quality RNA. Researchers at the Johns Hopkins School of Medicine demonstrate here that statistical adjustment using existing quality measures largely fails to remove the effects of RNA degradation when RNA quality associates with the outcome of interest. Using RNA-seq data from molecular degradation experiments of human primary tissues, they introduce a method, quality surrogate variable analysis (qSVA), as a framework for estimating and removing the confounding effect of RNA quality in differential expression analysis. They show that this approach results in greatly improved replication rates (>3×) across two large independent postmortem human brain studies of schizophrenia and also removes potential RNA quality biases in earlier published work that compared expression levels of different brain regions and other diagnostic groups. This approach can therefore improve the interpretation of differential expression analysis of transcriptomic data from human tissue.
qSVA improves replication across independent datasets
Expression differences between schizophrenia (SZ) cases and controls were modeled using four statistical models in the LIBD (discovery) and CMC (replication) datasets. For a given significance threshold in the discovery dataset, the researchers computed the replication rate in the replication dataset, where a gene counts as replicated if it shows the same fold-change direction for case status and P < 0.05. The qSVA approach had the highest replication rate, and the covariate-adjusted and SVA approaches had the lowest replication rates.
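The replication rate described in the caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `replication_rate`, its parameters, and the toy numbers are hypothetical, and it assumes that the discovery threshold is applied as a strict inequality on the discovery P value and that fold-change direction is compared through the sign of a log fold change.

```python
import math
from typing import Sequence


def replication_rate(
    disc_p: Sequence[float],
    disc_lfc: Sequence[float],
    rep_p: Sequence[float],
    rep_lfc: Sequence[float],
    disc_threshold: float,
    rep_alpha: float = 0.05,
) -> float:
    """Fraction of discovery-significant genes that replicate.

    All four sequences are aligned by gene. A gene is selected when its
    discovery P value is below disc_threshold (assumed strict inequality),
    and counts as replicated when its replication P value is below
    rep_alpha and both log fold changes have the same sign.
    """
    selected = [i for i, p in enumerate(disc_p) if p < disc_threshold]
    if not selected:
        # No gene passes the discovery threshold, so the rate is undefined.
        return math.nan
    replicated = sum(
        1
        for i in selected
        if rep_p[i] < rep_alpha and disc_lfc[i] * rep_lfc[i] > 0
    )
    return replicated / len(selected)


# Toy, made-up values for four genes (not taken from the study).
disc_p = [0.0001, 0.0005, 0.002, 0.2]
disc_lfc = [0.5, -0.3, 0.4, 0.1]
rep_p = [0.01, 0.2, 0.03, 0.001]
rep_lfc = [0.4, -0.2, -0.1, 0.2]

for threshold in (0.001, 0.01):
    rate = replication_rate(disc_p, disc_lfc, rep_p, rep_lfc, threshold)
    print(f"discovery P < {threshold}: replication rate = {rate:.3f}")
```

Evaluating the function over a grid of discovery thresholds, once per statistical model, would give one curve per model that could be compared in the way the caption describes.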
Availability – The qSVA approach is available in the sva Bioconductor package (https://bioconductor.org/packages/sva).
Jaffe AE, Tao R, Norris AL, Kealhofer M, Nellore A, Shin JH, Kim D, Jia Y, Hyde TM, Kleinman JE, Straub RE, Leek JT, Weinberger DR. (2017) qSVA framework for RNA quality correction in differential expression analysis. Proc Natl Acad Sci U S A. [Epub ahead of print]