Bringing RNAseq data analysis back into the hands of Biologists


In the last few years, Next Generation Sequencing has displaced expression array technology, and RNAseq has become the most widely used technique for expression analysis.

Our group has extensive experience in Bioinformatics training for Biologists, and we share their frustration at being unable to run basic RNAseq analyses without access to high-end hardware infrastructure.

Cloud-based approaches such as BaseSpace or Galaxy provide Biologists with this infrastructure, but with some limitations, e.g. the yearly license cost in the case of BaseSpace.

We thought that a stand-alone, cheap but powerful machine would be an ideal solution for labs running small RNAseq experiments, e.g. a few samples run per year.

We searched for high-end consumer hardware among gaming computers, which must guarantee performance well above what everyday PC work requires.

We found that Intel had released a bare mini PC (NUC6I7KYK, half the size of an A4 sheet), which can be equipped with up to two SSD disks and up to 32 GB of RAM.

We integrated into this hardware the consolidated RNAseq, miRNAseq and ChIPseq pipelines of the Reproducible Bioinformatics Project, under the control of a user-friendly GUI.

We were positively surprised by the performance of such a small computer. For example, using bcl2fastq for demultiplexing and STAR-RSEM for sample quantification, the NUC6I7KYK can process the 450 million reads generated by a NextSeq 500 in 20 hours. Since a standard NextSeq 500 run takes approximately 30 hours, this little computer can effectively work as a data processing server for a NextSeq 500 machine.
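The arithmetic behind this claim can be checked with a minimal sketch. It uses only the three figures quoted above (450 million reads, 20 hours of processing, a run of about 30 hours); the function and variable names are hypothetical, and it assumes that the processing of one run overlaps with the sequencing of the next one, which may not match how every lab schedules its instrument.

```python
# Back-of-the-envelope check using only the figures quoted in the text.
# All names below are illustrative and not part of SeqBox.

def reads_per_hour(total_reads: int, hours: float) -> float:
    """Average processing throughput in reads per hour."""
    return total_reads / hours


def keeps_up(processing_hours: float, run_hours: float) -> bool:
    """True if one run is fully processed before the next run finishes."""
    return processing_hours <= run_hours


TOTAL_READS = 450_000_000   # reads generated by one NextSeq 500 run
PROCESSING_HOURS = 20.0     # demultiplexing plus quantification on the mini PC
RUN_HOURS = 30.0            # approximate duration of a standard run

throughput = reads_per_hour(TOTAL_READS, PROCESSING_HOURS)
headroom_hours = RUN_HOURS - PROCESSING_HOURS
busy_fraction = PROCESSING_HOURS / RUN_HOURS

print(f"Throughput: {throughput / 1e6:.1f} million reads per hour")
print(f"Headroom per run: {headroom_hours:.0f} hours")
print(f"Fraction of run time spent processing: {busy_fraction:.0%}")
print(f"Keeps up with the sequencer: {keeps_up(PROCESSING_HOURS, RUN_HOURS)}")
```

Under these assumptions the machine processes about 22.5 million reads per hour and is busy for roughly two thirds of each sequencing run, so no backlog accumulates even when runs follow one another back to back.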

On the basis of these results, we packaged the RNAseq, miRNAseq and ChIPseq pipelines into SeqBox, a cost-effective hardware/software solution for computing-intensive tasks.

More information is available in our recent Application Note in Bioinformatics:

SeqBox: RNAseq/ChIPseq reproducible analysis on a consumer game computer

Abstract: Short reads sequencing technology has been used for more than a decade now. However, the analysis of RNAseq and ChIPseq data is still computational demanding and the simple access to raw data does not guarantee results reproducibility between laboratories. To address these two aspects, we developed SeqBox, a cheap, efficient and reproducible RNAseq/ChIPseq hardware/software solution based on NUC6I7KYK mini-PC (an Intel consumer game computer with a fast processor and a high performance SSD disk), and Docker container platform. In SeqBox the analysis of RNAseq and ChIPseq data is supported by a friendly GUI. This allows access to fast and reproducible analysis also to scientists with/without scripting experience.

