Biostate AI launches total RNA sequencing and free data analysis AI

Biostate AI, the scalable biodata foundry startup, emerges from stealth today with the launch of two service products: Total RNA sequencing and Copilot for RNAseq data analysis. Biostate AI aims to partner and collaborate with academic researchers, hospital biorepositories, and pharma/biotech companies, leveraging its new technologies for scalable multiomic data collection, scientific discovery, and AI training.

“The successful training of any AI well requires large quantities of relevant and high-quality data,” says David Zhang, co-founder and CEO of Biostate AI. “Biostate AI has developed the instrumental technologies to facilitate the collection of more biological data at lower costs. We are pleased to offer these capabilities to academic and industry partners and collaborators.”


Biostate’s Total RNA sequencing uses its patent-pending Barcode-Integrated Reverse Transcription (BIRT) technology to affordably, scalably, and comprehensively analyze all types of RNA. In contrast, traditional gene expression profiling using RNA sequencing (RNAseq) typically captures information only from messenger RNA, which accounts for less than 10% of all known RNA species. Biologically important classes of non-coding RNA include long non-coding RNAs (lncRNAs), microRNAs (miRNAs), circular RNAs (circRNAs), and enhancer RNAs (eRNAs).

Biostate AI has 9 patent applications pending on its technologies, and collaborates with a number of industry partners, including Twist Bioscience, on new technology development. Biostate AI also recently exclusively in-licensed further intellectual property (IP) from the California Institute of Technology (Caltech) to expand the range of biomolecules analyzed. These technologies will also reduce the amount of animal testing performed by pharma and biotech companies in preclinical studies.

To date, Biostate AI has raised more than $4M in venture funding. Matter Venture Partners led the funding round, with participation from institutional investors Vision Plus Capital, Catapult VC, and the California Institute of Technology through the Caltech Seed Fund. Individual investors in the round included Dario Amodei (CEO of Anthropic), Joris Poort (CEO of Rescale), Michael Schnall-Levin (CTO of 10X Genomics), and Emily Leproust (CEO of Twist Bioscience).

“AI is the next frontier and AI needs data, and biological data is a lot harder to get than text or images. We are excited about the potential for Biostate’s technology to dramatically lower the cost of collecting RNAseq datasets,” said Haomiao Huang, Founding Partner at Matter Venture Partners. “As a US company, Biostate’s affordable AI-embedded CRO services are much needed today as the supply of preclinical research services shrinks due to geopolitical tensions.”

Biostate AI also launched OmicsWeb Copilot today, a conversational AI designed to help biologists analyze and visualize data. Copilot leverages state-of-the-art large language models (LLMs) to understand user requests and intent and to build customized software and scripts for data analysis. In addition to analyzing the user’s own uploaded data, Copilot provides access to over 1,000 unique RNAseq datasets collected by the Biostate team. Copilot is being fine-tuned on 5,000 proprietary RNAseq datasets, enabling advanced analyses and anomaly detection. Biostate AI offers the Copilot platform at no cost to academic and nonprofit users and researchers.

“Bioinformatic analysis of RNAseq and other omics data is a highly complex, multi-step process that currently takes many hours of dedicated specialized programming,” said Ashwin Gopinath, co-founder and CTO of Biostate AI. “As we scaled up our RNAseq data collection in the past year, we started building OmicsWeb Copilot as an internal tool to help our scientists make sense of the data. And then we realized other people may also find this tool useful, so we’re opening it up to the general public for free.”

The ultimate goal of Biostate AI is to build AI that can predict human and animal health changes, including toxicity and efficacy responses to drugs. The team has recently demonstrated that RNA expression in blood taken from rats before drug dosing can predict survival with a hazard ratio of 8. To scale this proof-of-concept demonstration to the prediction of toxicity in humans for novel drugs, far more data must be collected, analyzed, and fed into AI models for training. In the course of this data collection, petabytes of RNAseq and other omics data must be collected, interpreted, and tokenized.

