Among the major challenges in next-generation sequencing experiments are exploratory data analysis, interpreting trends, identifying potential targets or candidates, and visualizing the results clearly and intuitively. These hurdles are higher still for researchers who are not experienced in writing computer code, since most available analysis tools require programming skills. Even proficient computational biologists need an efficient and replicable system to generate standardized results.
Researchers at Tel Aviv University have developed RNAlysis, a modular Python-based analysis software for RNA sequencing data. RNAlysis allows users to build customized analysis pipelines suited to their specific research questions, from raw FASTQ files, through exploratory data analysis, data visualization, and clustering analysis, to gene-set enrichment analysis. RNAlysis provides a friendly graphical user interface, allowing researchers to analyze data without writing code. The researchers demonstrate the use of RNAlysis by analyzing RNA sequencing data from different studies of C. elegans nematodes. They note that the software is equally applicable to data obtained from any organism.
Top section: A typical analysis with RNAlysis can start at any stage, from raw or trimmed FASTQ files to more processed data tables such as count matrices, differential expression tables, or any other form of tabular data.
Middle section: data tables can be filtered, normalized, and transformed with a wide variety of filtering functions, allowing users to clean up their data, fine-tune their analysis to their biological questions, or prepare the data for downstream analysis. RNAlysis also provides users with a broad assortment of customizable clustering methods, to help recognize genes with similar expression patterns, and visualization methods to aid in data exploration. All of these functions can be arranged into customized Pipelines that can be applied to multiple tables in one click, or exported and shared with others.
Bottom section: Once users have focused their data tables into gene sets of interest, or imported such gene sets from another source, they can use RNAlysis to visualize the intersection between different gene sets, extract lists of genes from any set operations applied to their gene sets and data tables, and perform enrichment analysis for their gene sets, using either public datasets such as GO and KEGG or customized, user-defined enrichment attributes.
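The set operations and enrichment step described in the bottom section can be illustrated with a minimal sketch in plain Python. This is not the RNAlysis API: the gene IDs, the function name `hypergeom_enrichment_pvalue`, and the choice of a one-sided hypergeometric test are all assumptions made for illustration, and the article does not state which statistical tests the software itself uses.

```python
from math import comb


def hypergeom_enrichment_pvalue(gene_set, attribute_genes, background):
    """Return P(overlap >= observed) under the hypergeometric distribution.

    Hypothetical helper for illustration only; not part of RNAlysis.
    """
    background = set(background)
    gene_set = set(gene_set) & background
    attribute_genes = set(attribute_genes) & background

    n_background = len(background)           # all genes that could be drawn
    n_attribute = len(attribute_genes)       # genes carrying the attribute
    n_drawn = len(gene_set)                  # size of the gene set of interest
    observed = len(gene_set & attribute_genes)

    total = comb(n_background, n_drawn)
    tail = sum(
        comb(n_attribute, i) * comb(n_background - n_attribute, n_drawn - i)
        for i in range(observed, min(n_attribute, n_drawn) + 1)
    )
    return tail / total


# Made-up gene IDs standing in for real identifiers.
background = {f"g{i}" for i in range(20)}
set_a = {f"g{i}" for i in range(0, 6)}    # g0..g5
set_b = {f"g{i}" for i in range(3, 10)}   # g3..g9

# Set operations between gene sets.
shared = set_a & set_b        # intersection
only_in_a = set_a - set_b     # difference
either = set_a | set_b        # union

# A user-defined enrichment attribute: genes annotated with some property.
attribute_genes = {"g3", "g4", "g5", "g6"}

p_value = hypergeom_enrichment_pvalue(shared, attribute_genes, background)
print(sorted(shared), p_value)
```

The p-value here answers one question: if a gene set of this size were drawn at random from the background, how likely is an overlap with the attribute at least as large as the one observed? A real analysis would also correct for testing many attributes at once.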
RNAlysis is suitable for investigating a variety of biological questions and allows researchers to run comprehensive bioinformatic analyses more accurately and reproducibly. It serves as a gateway into RNA sequencing analysis for researchers with less computational experience, and can also help experienced bioinformaticians make their analyses more robust and efficient through its diverse tools, scalability, automation, and standardization between analyses.
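The idea of arranging filtering, normalization, and transformation functions into a reusable pipeline that is applied to several tables can be sketched as follows. This is a plain-Python illustration under stated assumptions, not RNAlysis code: the table layout (gene ID mapped to per-sample counts), the step names, the threshold of 10 reads, and the example counts are all hypothetical.

```python
import math
from typing import Callable, Dict, List

# Assumed layout: gene ID -> read counts, one value per sample.
CountTable = Dict[str, List[float]]
Step = Callable[[CountTable], CountTable]


def filter_low_counts(min_total: float) -> Step:
    """Build a step that drops genes whose summed counts are below min_total."""
    def step(table: CountTable) -> CountTable:
        return {g: c for g, c in table.items() if sum(c) >= min_total}
    return step


def normalize_to_rpm(table: CountTable) -> CountTable:
    """Scale every sample (column) to reads per million."""
    if not table:
        return {}
    n_samples = len(next(iter(table.values())))
    totals = [sum(c[i] for c in table.values()) for i in range(n_samples)]
    return {
        g: [x / t * 1e6 if t else 0.0 for x, t in zip(c, totals)]
        for g, c in table.items()
    }


def log2_transform(table: CountTable) -> CountTable:
    """Apply log2(x + 1) to every value."""
    return {g: [math.log2(x + 1) for x in c] for g, c in table.items()}


def run_pipeline(steps: List[Step], table: CountTable) -> CountTable:
    """Apply each step in order and return the final table."""
    for step in steps:
        table = step(table)
    return table


# One pipeline definition, reused unchanged across tables.
pipeline = [filter_low_counts(min_total=10), normalize_to_rpm, log2_transform]

tables = {
    "experiment_a": {"gene1": [100, 300], "gene2": [0, 2], "gene3": [900, 700]},
    "experiment_b": {"gene1": [5, 1], "gene2": [50, 150], "gene3": [450, 350]},
}
results = {name: run_pipeline(pipeline, t) for name, t in tables.items()}
```

Because the pipeline is an ordinary list of steps, the same definition can be applied to any number of tables, which is what makes results comparable between analyses. The order of steps matters: here, filtering runs before normalization, so the per-sample totals exclude the genes that were dropped.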
Availability – https://github.com/GuyTeichman/RNAlysis