New tool integrates microbiome and host genetic sequencing analysis

A new software tool makes it easier to study relationships between a host, its microbiome and pathogens like HIV or SARS-CoV-2.

Researchers at Texas Biomedical Research Institute and Tulane University have developed a new software tool that makes it easier, faster and more cost-effective to analyze genetic information about a host and its microbiome at the same time. The software, called “meta-transcriptome detector” (MTD), can be used by a wide range of microbiologists and drug developers, including those researching diseases such as certain cancers, COVID-19, HIV/AIDS, malaria and many other human health conditions linked to microorganisms. A paper describing the tool was recently published in the journal Briefings in Bioinformatics.

“It is very user-friendly, especially for researchers with little to no background in bioinformatics,” says Associate Professor Binhua “Julie” Ling, PhD, who co-leads Texas Biomed’s Host-Pathogen Interactions Research Program and is the senior author of the paper. “You only need to write one line of code to set certain parameters and the software does the rest automatically.”

MTD enables researchers to get a comprehensive snapshot of what microbes are present in the host – including the huge range of “good bacteria” that normally live on and inside people and animals, as well as the harmful ones, such as viruses that cause serious illnesses. More importantly, MTD analyzes gene expression activity, essentially which genes are turned on or off, in both the microbes and the host simultaneously, allowing researchers to easily spot relationships between them. For example, seeing which genes are active in a microbe and in the host could indicate that the activity of one is influenced by the other, and might suggest a potential target for treatment down the line, Ling explains.

An overview of MTD

(A): A workflow for bulk mRNA-seq analysis. (B): A workflow for single-cell mRNA-seq analysis. White boxes represent the reads in FASTQ format and the count matrix. Blue boxes show the bioinformatics software used. Green boxes are the additional tools for data processing. White boxes with curved edges show the reference genome and databases. In the single-cell mRNA-seq workflow (B), the left side illustrates the processing protocol for host reads, and the right side, shaded in yellow, shows the automatic MTD pipeline that calculates the count matrix for the microbiome reads and runs the correlation test between microbiome and host genes.

“We can’t say at this stage if it is cause and effect, but we can use this analysis to pinpoint what genes or pathways we should be investigating – perhaps ones that we never considered before as being related,” Ling says. “MTD can help accelerate that process and potentially open new avenues of research and drug development.”

MTD analyzes gene expression activity in both the host and microbiome at the same time, and automatically generates a graphic showing correlations between the two. Darker red shows a stronger positive correlation, meaning the activity of a certain host gene or pathway and the activity of a certain type of microbe move in the same direction: both are upregulated or both are downregulated. Darker blue shows a stronger negative correlation, meaning they move in opposite directions, for example a host gene or pathway is less active while the microbe species is more active. While the analysis does not indicate that one causes the other, it can highlight relationships for researchers to investigate further, which could help clarify cause and effect and guide the potential development of treatments. Credit: Texas Biomed
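To make the idea of a host-microbe correlation graphic concrete, here is a minimal sketch of how such a table of correlations could be computed. It is not MTD's code: the article does not say which correlation statistic MTD uses, so Pearson correlation is assumed here purely for illustration, and the gene names, microbe names and values are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(host, microbes):
    """Correlate every host gene with every microbe across the same samples."""
    return {
        gene: {taxon: pearson(expr, abund) for taxon, abund in microbes.items()}
        for gene, expr in host.items()
    }


# Hypothetical normalized values measured in the same five samples.
host_expression = {
    "host_gene_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "host_gene_B": [5.0, 4.0, 3.0, 2.0, 1.0],
}
microbe_abundance = {
    "microbe_X": [2.0, 4.0, 6.0, 8.0, 10.0],
    "microbe_Y": [3.0, 1.0, 4.0, 1.0, 5.0],
}

matrix = correlation_matrix(host_expression, microbe_abundance)

# Values near +1 would be drawn dark red, values near -1 dark blue.
for gene, row in matrix.items():
    for taxon, r in row.items():
        print(f"{gene} vs {taxon}: r = {r:+.2f}")
```

In this toy data, host_gene_A rises in step with microbe_X (a strong positive correlation), while host_gene_B falls as microbe_X rises (a strong negative correlation). As in the figure, neither result says anything about which one drives the other.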

MTD links together several existing software packages and pulls from international databases containing RNA sequences of more than 100,000 species of bacteria, viruses, fungi, archaea and protozoa, as well as sequences of vectors and plasmids. Users are also able to update the database with specific sequences they are interested in.

The paper’s first author, Fei Wu, PhD, worked with Ling to study how the microbiome changes with age in younger and older monkeys with simian immunodeficiency virus (SIV), the monkey version of HIV.

“We were having to analyze gene expression from the host by one workflow, and the microbiome gene expression by a separate workflow,” Wu says. “We wondered why can’t we do both at the same time?”

Since their lab was relocating during the COVID-19 pandemic, they had the time to focus on computer-based work, and set out to solve this issue. Wu worked with Ling and collaborator Yao-Zhong Liu, PhD, an Associate Professor at Tulane University School of Public Health and Tropical Medicine, to build and test the new software.

“Normally, we are using bioinformatics software to analyze our data, not building it,” Wu says. “It was challenging, but exciting, to branch out and now have something that will not only help us, but also any other researcher doing RNA sequencing of hosts and microbiomes, from humans and monkeys, to mosquitos carrying malaria parasites and snails carrying schistosome parasites.”

MTD offers several advantages over existing tools. Primarily, it has the ability to analyze RNA sequences of both the host and the microbiota from the same sample – either single cells or a bulk tissue sample. This saves both time and resources, in addition to providing new insights into the relationships between the microorganisms and host. It does require some high-performance computing power, but otherwise is very accessible to researchers without much bioinformatics experience.

Source – Texas Biomedical Research Institute

Availability – MTD is available on GitHub.

Wu F, Liu YZ, Ling B. (2022) MTD: a unique pipeline for host and meta-transcriptome joint and integrative analyses of RNA-seq data. Brief Bioinform [Epub ahead of print].
