NEUMA (Normalization by Expected Uniquely Mappable Area) is a novel, efficient and intuitive approach of estimating mRNA abundances from the whole transcriptome shotgun sequencing (RNA-Seq) data. NEUMA, is based on effective length normalization using uniquely mappable areas of gene and mRNA isoform models. Using the known transcriptome sequence model such as RefSeq, NEUMA pre-computes the numbers of all possible gene-wise and isoform-wise informative reads: the former being sequences mapped to all mRNA isoforms of a single gene exclusively and the latter uniquely mapped to a single mRNA isoform. The results are used to estimate the effective length of genes and transcripts, taking experimental distributions of fragment size into consideration. Quantitative RT-PCR based on 27 randomly selected genes in two human cell lines and computer simulation experiments demonstrated superior accuracy of NEUMA over other recently developed methods. NEUMA covers a large proportion of genes and mRNA isoforms and offers a measure of consistency (‘consistency coefficient’) for each gene between an independently measured gene-wise level and the sum of the isoform levels. NEUMA is applicable to both paired-end and single-end RNA-Seq data. We propose that NEUMA could make a standard method in quantifying gene transcript levels from RNA-Seq data.
NEUMA is available at: http://neuma.kobic.re.kr/
Lee S, Seo CH, Lim B, Yang JO, Oh J, Kim M, Lee S, Lee B, Kang C, Lee S. (2011) Accurate quantification of transcriptome from RNA-Seq data by effective length normalization. Nucleic Acids Res 39(2), e9. [abstract]