scphaser – haplotype inference using single-cell RNA-seq data

Determination of haplotypes is important for modelling the phenotypic consequences of genetic variation in diploid organisms, including cis-regulatory control and compound heterozygosity. Karolinska Institute researchers realized that single-cell RNA-seq (scRNA-seq) data is well suited for phasing genetic variants, since both transcriptional bursts and technical bottlenecks cause pronounced allelic fluctuations in individual cells.

Here the researchers present scphaser, an R package that phases alleles at heterozygous variants to reconstruct haplotypes within transcribed regions of the genome using scRNA-seq data. The devised method efficiently and accurately reconstructed the known haplotype for ≥93% of phasable genes in both human and mouse. It also enables phasing of rare and de novo variants and variants far apart within genes, which is hard to attain with population-based computational inference.
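The idea behind phasing from allelic fluctuations can be illustrated with a toy sketch. This is not the scphaser implementation (scphaser is an R package; Python is used here only for illustration). The function name `phase_gene`, the +1/-1/0 encoding of allelic calls, and the simple concordance score are all assumptions made for this example. For the variants of one gene, the sketch tries every way of swapping reference and alternative alleles and keeps the orientation under which the variants agree most often within individual cells.

```python
from itertools import product


def phase_gene(calls):
    """Toy phasing of the heterozygous variants of one gene (illustrative only).

    calls: one list per variant, with one entry per cell:
           +1 = only the reference allele observed in that cell,
           -1 = only the alternative allele observed,
            0 = biallelic observation or no call.
    Returns a tuple with one flip per variant: +1 keeps the variant as is,
    -1 swaps its reference and alternative alleles.
    """
    n_variants = len(calls)
    n_cells = len(calls[0])
    best_flips, best_score = None, None
    # Haplotype labels are arbitrary, so the first variant is held fixed
    # and only the remaining variants are flipped.
    for rest in product((1, -1), repeat=n_variants - 1):
        flips = (1,) + rest
        score = 0
        for c in range(n_cells):
            # Within a cell, concordant variants add up and discordant
            # variants cancel, so |sum| rewards agreement.
            col_sum = sum(flips[v] * calls[v][c] for v in range(n_variants))
            score += abs(col_sum)
        if best_score is None or score > best_score:
            best_flips, best_score = flips, score
    return best_flips


# Hypothetical calls for three variants across four cells.
calls = [
    [1, -1, 1, 0],    # variant A
    [-1, 1, -1, 1],   # variant B: consistently opposite to A
    [1, -1, 0, -1],   # variant C: consistent with A
]
print(phase_gene(calls))  # -> (1, -1, 1)
```

In this made-up example the reference allele of variant B sits on the opposite haplotype to the reference alleles of A and C, so B is the only variant that is flipped. An exhaustive search like this grows as 2^(n-1) with the number of variants, so it is only practical for genes with few variants.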

Concept and performance of scphaser

(A) Number of genes against observed ASE in scRNA-seq data (two human datasets and one mouse dataset) and bulk RNA-seq data. The line indicates the mean and the band the inter-quartile range across cells. (B) Transcriptional bursts and technical drop-out cause frequent monoallelic or allele-biased observations in scRNA-seq data, which can reveal the phase of transcribed sequences. (C) Percentage of correctly phased SNVs in the human and mouse datasets. X-axis labels denote the input, method and weighting settings used for phasing.

Availability – scphaser is implemented as an R package. Tutorial and code are available at https://github.com/edsgard/scphaser

Contact – rickard.sandberg@ki.se

Edsgärd D, Reinius B, Sandberg R. (2016) scphaser: haplotype inference using single-cell RNA-seq data. Bioinformatics [Epub ahead of print].
