Comparison of Computational Methods for Identification of Allele-Specific Expression based on Next Generation Sequencing Data

Allele-specific expression (ASE) studies have wide-ranging implications for genome biology and medicine. Whole transcriptome RNA sequencing (RNA-Seq) has emerged as a genome-wide tool for identifying ASE, but it suffers from mapping bias favoring reference alleles. Two categories of methods are currently used to reduce the effect of mapping bias on ASE identification: normalizing the RNA allelic ratio by the parallel genomic allelic ratio (pDNAar), and modifying the reference genome so that reads carrying either allele have the same chance of being mapped (mREF).
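To make the two ideas concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the implementation used in the study: the function names, the 0.5 pseudocount, the odds-ratio form of the normalization, and the choice of masking SNP positions with `N` as the reference modification are all hypothetical, since the summary does not say how either method was implemented.

```python
from typing import Iterable


def normalized_allelic_odds(rna_ref: int, rna_alt: int,
                            dna_ref: int, dna_alt: int,
                            pseudocount: float = 0.5) -> float:
    """pDNAar-style idea (assumed form): divide the RNA ref/alt odds
    by the DNA ref/alt odds measured at the same heterozygous site.

    If mapping bias inflates reference counts equally in RNA and DNA,
    it cancels in the ratio; a value near 1 suggests no ASE.
    The pseudocount (an assumption) avoids division by zero.
    """
    rna_odds = (rna_ref + pseudocount) / (rna_alt + pseudocount)
    dna_odds = (dna_ref + pseudocount) / (dna_alt + pseudocount)
    return rna_odds / dna_odds


def mask_reference(sequence: str, snp_positions: Iterable[int]) -> str:
    """mREF-style idea (assumed form): replace the base at each known
    SNP position (0-based) with 'N', so that reads carrying either
    allele mismatch the reference equally at that site.
    """
    bases = list(sequence)
    for pos in snp_positions:
        bases[pos] = "N"
    return "".join(bases)


if __name__ == "__main__":
    # Same 60:40 skew in RNA and DNA: the bias cancels out.
    print(normalized_allelic_odds(60, 40, 60, 40))
    # RNA skew exceeds DNA skew: value above 1, a candidate ASE site.
    print(normalized_allelic_odds(80, 20, 60, 40))
    print(mask_reference("ACGTACGT", [1, 4]))
```

The first function corrects for bias after mapping, while the second tries to remove the bias before mapping, which is the basic contrast between the two categories described above.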

Researchers at the Shanghai Institutes for Biological Sciences compared the sensitivity and specificity of both methods using simulated data. They demonstrated that pDNAar, though ideally practical, had lower sensitivity because of its lower mapping rate for reads carrying nonreference (alternative) alleles, whereas mREF achieved higher sensitivity and specificity thanks to its efficiency in mapping reads carrying either allele. Applying the two methods to real sequencing data showed that mREF was able to identify more ASE loci because of its higher mapping efficiency, and was able to correct some seemingly incorrect ASE loci that pDNAar identified because of its inefficiency in mapping reads carrying alternative alleles. This study provides useful information for RNA sequencing data processing in the identification of ASE.


Liu Z, Yang J, Xu H, Li C, Wang Z, Li Y, Dong X, Li Y. (2014) Comparing Computational Methods for Identification of Allele-Specific Expression based on Next Generation Sequencing Data. Genet Epidemiol [Epub ahead of print].