Implementation of an Open Source Software solution for Laboratory Information Management and automated RNA-seq data analysis

Large-scale cancer genomics initiatives and next-generation sequencing for transcriptome profiling allow for detailed molecular characterization of tumors, and provide opportunities for clinical tools to improve diagnosis, prognosis, and treatment decisions. Laboratory information management, data management, and data sharing are challenging in large-scale genomics projects. Introducing such technologies in a clinical setting poses additional challenges, because it requires short lead-times and specialized tracking of biomaterials, data, and analysis results.

Using the free open-source BioArray Software Environment (BASE) and the extension package Reggie, researchers at Lund University have implemented a laboratory information management system and an automated RNAseq data analysis pipeline that together manage a large regional cancer genomics initiative. The system tracks enrolled cancer patients, tumor biopsies, extraction of nucleic acid, and whole transcriptome RNA-sequencing through to data analysis and quality control. The implementation integrates laboratory equipment and operating procedures, and tracks information in a module-based fashion that enables efficient and flexible use of personnel resources. The system provides two-factor authentication, transaction control, and seamless integration of freely available software for RNAseq analysis such as Tophat, Cufflinks, and Picard.
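The tracking chain described above (patient, biopsy, nucleic acid extract, sequencing library) can be pictured as a simple parent/child lineage. The following is a minimal Python sketch for illustration only: it is not the BASE or Reggie data model or API, and the `Item` class, the `kind` labels, and the identifiers such as `P001` are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Item:
    """One tracked item. Hypothetical model, not the BASE/Reggie schema."""
    kind: str                      # illustrative label, e.g. "patient"
    name: str                      # illustrative identifier, e.g. "P001"
    parent: Optional[Item] = field(default=None, repr=False)
    children: list[Item] = field(default_factory=list, repr=False)

    def derive(self, kind: str, name: str) -> Item:
        """Register a new item derived from this one and return it."""
        child = Item(kind, name, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> list[str]:
        """Return the chain from the root item down to this item."""
        chain = []
        node: Optional[Item] = self
        while node is not None:
            chain.append(f"{node.kind}:{node.name}")
            node = node.parent
        return list(reversed(chain))


# Hypothetical example: one patient followed through to a sequencing library.
patient = Item("patient", "P001")
biopsy = patient.derive("biopsy", "P001.1")
rna = biopsy.derive("rna_extract", "P001.1.r1")
library = rna.derive("library", "P001.1.r1.lib1")

print(library.lineage())
```

Keeping a parent link on every derived item means any analysis result can be traced back to its biopsy and patient, which is the kind of tracking the clinical setting calls for.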


Schematic overview of the implemented solution for laboratory information management and automated RNAseq data analysis.

As of February 2016 more than 8000 patients and over 6000 tumor biopsies have been successfully processed. Lead-time from biopsy arrival to summarized reports based on RNAseq data is less than 5 days, in line with regional clinical requirements. BASE and Reggie are freely available and released as open-source under the GNU General Public License and GNU Affero General Public License, respectively.

Using free open-source software together with BASE and a customized extension package, Reggie, the Lund researchers have implemented a system capable of managing large collections of quality-controlled and curated material for use in research and development, tailored to meet requirements for clinical use. Featuring a high degree of automation and interactivity, the system allows for resource-efficient laboratory procedures and short lead-times, with demonstrated use of RNAseq data analyses in a clinical setting.

Hakkinen J, Nordborg N, Mansson O, Vallon-Christersson J. (2016) Implementation of an Open Source Software solution for Laboratory Information Management and automated RNAseq data analysis in a large-scale Cancer Genomics initiative using BASE with extension package Reggie. bioRxiv [Epub ahead of print]. [abstract]
