Blood-based diagnostic tests, using individual biomarkers or biomarker panels, may revolutionize disease diagnostics and enable minimally invasive therapy monitoring. However, selection of the most relevant biomarkers from liquid biosources remains an immense challenge. Researchers at the VU University Medical Center recently presented the thromboSeq pipeline, which enables RNA sequencing and cancer classification via self-learning and swarm intelligence-enhanced bioinformatics algorithms using blood platelet RNA. Here, they provide the wet-lab protocol for the generation of platelet RNA-sequencing libraries and the dry-lab protocol for the development of swarm intelligence-enhanced machine-learning-based classification algorithms. The wet-lab protocol includes platelet RNA isolation, mRNA amplification, and preparation for next-generation sequencing. The dry-lab protocol describes the automated pre-processing of FASTQ files to quantified gene counts, quality controls, data normalization and correction, and swarm intelligence-enhanced support vector machine (SVM) algorithm development. This protocol enables platelet RNA profiling from 500 pg of platelet RNA and allows automated and optimized biomarker panel selection. The wet-lab protocol can be performed in 5 d before sequencing, and the algorithm development can be completed in 2 d, depending on computational resources. The protocol requires basic molecular biology skills and a basic understanding of Linux and R. In all, with this protocol, the authors aim to enable the scientific community to test platelet RNA for diagnostic algorithm development.
Overview of PSO-enhanced thromboSeq
a, Whole blood is subjected to platelet isolation, total RNA isolation, SMARTer cDNA synthesis, and Illumina TruSeq labeling. Labeled cDNA is subsequently sequenced using Illumina sequencing (e.g., the HiSeq 2500 or 4000 platform), and RNA-sequencing reads can be subjected to the PSO-enhanced thromboSeq pipeline (as indicated by the bird symbol, referring to swarm intelligence). The detailed protocol is described in Steps 1–109. b, Representative examples of platelet isolation. Whole blood is collected in purple-capped EDTA-coated Vacutainers and centrifuged for 20 min at 120g. The platelet-rich plasma (upper yellow layer) separates from the red blood cell layer (lowest layer) and the buffy coat (in between). The platelet-rich plasma is collected in a 15-mL tube and centrifuged for 20 min at 360g. A platelet pellet appears and, following removal of the platelet-depleted plasma, the platelet pellet is resuspended in RNAlater. c, Schematic representation of the PSO meta-algorithm for classification algorithm development. Platelet-RNA-sequencing samples assigned to the training series (top) are used for algorithm development, which is represented as a bin with a bird, referring to swarm intelligence. This results in a dedicated parameter-specific spliced RNA biomarker panel. The performance of this algorithm is assessed in the evaluation series (bottom), and this feedback enables PSO to adjust the parameter settings toward those expected to be more stringent and successful (arrow toward the bin). Upon successful parameter selection and algorithm training, the performance is assessed in an independent validation series (lower-right corner). PRP, platelet-rich plasma.
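The feedback loop in panel c can be illustrated with a minimal particle swarm optimization (PSO) sketch. This is not the thromboSeq implementation, which is written in R. It is a generic, standard-library Python version of the technique. The function names (pso_minimize, toy_evaluation_error), the two tuned parameters (log_cost, log_gamma), and the swarm settings are all assumptions made for illustration. In the real pipeline, each objective call would train a classifier on the training series and return its error on the evaluation series. Here a simple quadratic stands in for that error surface.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100, seed=0,
                 inertia=0.7, c_personal=1.5, c_social=1.5):
    """Generic PSO: search for the parameter vector that minimizes `objective`."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Each particle is one candidate parameter setting, started at random.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # best setting seen by each particle
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    swarm_pos, swarm_val = best_pos[g][:], best_val[g]  # best seen by the swarm

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull each particle toward its own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (swarm_pos[d] - pos[i][d]))
                # Move, then clamp to the allowed parameter range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < swarm_val:
                    swarm_pos, swarm_val = pos[i][:], val
    return swarm_pos, swarm_val


def toy_evaluation_error(params):
    """Hypothetical stand-in for the classifier error on the evaluation series.

    A real objective would train on the training series with these parameters
    and score on the evaluation series; this toy surface has its minimum at
    log_cost = 1, log_gamma = -2.
    """
    log_cost, log_gamma = params
    return (log_cost - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = pso_minimize(
    toy_evaluation_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)]
)
print("best parameters:", best_params)
print("evaluation error:", best_error)
```

Note that the validation series never enters the objective. As in panel c, it would be scored only once, after parameter selection is finished, so that the reported performance is not biased by the search.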