ContextMap – Mining RNA-Seq Data for Infections and Contaminations

RNA sequencing (RNA-seq) provides novel opportunities for transcriptomic studies at nucleotide resolution, including transcriptomics of viruses or microbes infecting a cell. However, standard approaches for mapping the resulting sequencing reads generally ignore sources of expression other than the host cell and are poorly equipped to address the problems arising from redundancies and gaps among sequenced microbe and virus genomes.

Now, researchers at Ludwig-Maximilians-University of Munich, Germany, have developed ContextMap, a mapping tool for screening sequencing reads for contaminations and infections. Based on mapping-derived statistics, the mapping confidence for species/strains, the similarities between them, and misidentifications (e.g. due to missing genome sequences) can be assessed.
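
To give a feel for what "mapping-derived statistics" can look like, here is a minimal Python sketch. It is not ContextMap's actual algorithm or output format: the function name `species_mapping_stats`, the `(read_id, species)` input layout, and the `unique_fraction` measure are all assumptions made for illustration. It counts, per species, how many reads map only to that species versus how many are shared with others, and tallies which species pairs share reads.

```python
from collections import Counter, defaultdict
from itertools import combinations


def species_mapping_stats(alignments):
    """Summarize read-to-species mappings (illustrative only, not ContextMap).

    alignments: iterable of (read_id, species) pairs; a read that maps to
    several species appears once per species.
    Returns (per-species stats, counts of reads shared by each species pair).
    """
    # Collect the set of species each read maps to.
    species_by_read = defaultdict(set)
    for read_id, species in alignments:
        species_by_read[read_id].add(species)

    total = Counter()         # reads mapping to a species at all
    unique = Counter()        # reads mapping to that species only
    shared_pairs = Counter()  # reads shared between two species

    for species_set in species_by_read.values():
        for species in species_set:
            total[species] += 1
        if len(species_set) == 1:
            unique[next(iter(species_set))] += 1
        else:
            for a, b in combinations(sorted(species_set), 2):
                shared_pairs[(a, b)] += 1

    stats = {
        species: {
            "total": total[species],
            "unique": unique[species],
            # Hypothetical confidence proxy: share of uniquely mapped reads.
            "unique_fraction": unique[species] / total[species],
        }
        for species in total
    }
    return stats, dict(shared_pairs)


if __name__ == "__main__":
    # Toy data with made-up read and species names.
    toy = [("r1", "A"), ("r2", "A"), ("r2", "B"), ("r3", "B"), ("r4", "A")]
    stats, shared = species_mapping_stats(toy)
    print(stats)
    print(shared)
```

A species with few uniquely mapped reads and many reads shared with a close relative would be a candidate for a misidentification of the kind described above, although a real tool would weigh far more evidence than this simple ratio.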


In the paper, the performance of ContextMap is evaluated on three real-life sequencing data sets and compared to state-of-the-art metagenomics tools. In particular, ContextMap vastly outperformed GASiC and GRAMMy in terms of runtime. In contrast to MEGAN4, it was capable of providing individual read mappings to species and resolving non-unique mappings, thus allowing the identification of misalignments caused by sequence similarities between genomes and by missing genome sequences. The study illustrates the importance and potential of routinely mining RNA-seq experiments for infections or contaminations by microbes and viruses. By using ContextMap, gene expression of infecting agents can be analyzed and novel insights into infection processes and tumorigenesis can be obtained.

Availability – ContextMap is available at:

  • Bonfert T, Csaba G, Zimmer R, Friedel CC (2013) Mining RNA-Seq Data for Infections and Contaminations. PLoS ONE 8(9): e73071. [article]