Recommendations for differential expression analysis and biomarker discovery in small RNA-Seq experiments in an age of liquid biopsies

Small RNA-Seq has emerged as a powerful tool in transcriptomics, gene expression profiling and biomarker discovery. Sequencing cell-free nucleic acids, particularly microRNA (miRNA), from liquid biopsies additionally provides exciting possibilities for molecular diagnostics, and might help establish disease-specific biomarker signatures. The complexity of the small RNA-Seq workflow, however, bears challenges and biases that researchers need to be aware of in order to generate high-quality data. Rigorous standardization and extensive validation are required to guarantee reliability, reproducibility and comparability of research findings. Hypotheses based on flawed experimental conditions can be inconsistent and even misleading. Comparable to the well-established MIQE guidelines for qPCR experiments, this work aims at establishing guidelines for experimental design and pre-analytical sample processing, standardization of library preparation and sequencing reactions, as well as facilitating data analysis.

Researchers at the Technical University of Munich highlight bottlenecks in small RNA-Seq experiments, point out the importance of stringent quality control and validation, and provide a primer for differential expression analysis and biomarker discovery. Following these recommendations will encourage better sequencing practice, increase experimental transparency and lead to more reproducible small RNA-Seq results. This will ultimately enhance the validity of biomarker signatures, and allow reliable and robust clinical predictions.

Crucial steps and recommendations for small RNA-Seq data analysis

| Step | To consider | Recommended tools or algorithms |
|---|---|---|
| Data pre-processing | Trimming adapters; removing short reads | Btrim, FASTX-Toolkit |
| Quality control | Library size and read distribution across samples; per base/sequence Phred score; read length distribution; assess degradation; check for over-represented sequences | Btrim, FASTX-Toolkit, FaQCs |
| Read alignment (filtering) | Reference database or genome; annotation; mismatch rate; handling of multi-reads | Bowtie, BWA, HTSeq, SAMtools, SOAP2 |
| Normalization | Library sizes and sequencing depth; batch effects; read distribution; replication level | DESeq2, edgeR, svaseq |
| DGE analysis | Data distribution; replication level; false discovery rate | DESeq2, edgeR, SAMSeq, voom (limma) |
| Target prediction of miRNAs / siRNAs | In silico prediction or experimental validation; canonical and non-canonical target regulation | miRanda, miRTarBase, TarBase |
| Biomarker identification | Sensitivity; specificity; classification rate | DESeq2, Simca-Q, numerous R packages (base, pcaMethods, mixOmics) |
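To make a few of the rows above more concrete, here is a minimal Python sketch of four of the listed considerations: adapter trimming with removal of short reads, library-size normalization, false discovery rate control, and sensitivity/specificity of a candidate biomarker. It is not taken from the paper and is not a substitute for the tools in the table. The adapter sequence, the 16 nt length cutoff, the use of counts per million and of the Benjamini-Hochberg procedure, and all function names are assumptions chosen purely for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Example 3' adapter and minimum insert length. Both values are assumptions
# made for this sketch, not recommendations from the paper.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
MIN_LENGTH = 16


def trim_adapter(read: str, adapter: str = ADAPTER,
                 min_length: int = MIN_LENGTH) -> Optional[str]:
    """Cut the read at the first exact adapter match.

    Returns None when the remaining insert is shorter than min_length.
    Real trimmers also tolerate mismatches and partial adapters.
    """
    position = read.find(adapter)
    insert = read if position == -1 else read[:position]
    return insert if len(insert) >= min_length else None


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale the raw counts of one library to counts per million reads."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("library contains no reads")
    return {name: value * 1_000_000 / library_size
            for name, value in counts.items()}


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


def sensitivity_specificity(truth: Sequence[int],
                            predicted: Sequence[int]) -> Tuple[float, float]:
    """Compute sensitivity and specificity from binary labels (1 = disease)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in truth")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    reads = ["TGAGGTAGTAGGTTGTATAGTT" + ADAPTER, "ACGT" + ADAPTER]
    kept = [r for r in (trim_adapter(read) for read in reads) if r is not None]
    print(kept)
    print(counts_per_million({"miR-a": 1, "miR-b": 3}))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    print(sensitivity_specificity([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```

Counts per million only corrects for sequencing depth. The normalization methods in DESeq2 and edgeR additionally account for differences in read distribution between libraries, which is why the table lists dedicated tools for that step.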

 

Buschmann D, Haberberger A, Kirchner B, Spornraft M, Riedmaier I, Schelling G, Pfaffl MW. (2016) Toward reliable biomarker signatures in the age of liquid biopsies – how to standardize the small RNA-Seq workflow. Nucleic Acids Res [Epub ahead of print].
