Although RNA sequencing (RNA-seq) has become the most advanced technology for transcriptome analysis, it still faces various challenges. The RNA-seq workflow is complicated, and bias can be introduced at many of its steps. Such bias can degrade the quality of an RNA-seq dataset and lead to incorrect interpretation of the sequencing results. A detailed understanding of the source and nature of these biases is therefore essential for interpreting RNA-seq data, for finding ways to improve the quality of RNA-seq experiments, and for developing bioinformatics tools that compensate for these biases. Researchers from Southeast University discuss the sources of experimental bias in RNA-seq and, for each type of bias, the methods for improvement.
Simplified protocol of an RNA-seq experiment and sources of bias
(a) Sample preservation and isolation. Biases at this stage include sample degradation and DNA contamination. (b) Strategies for cDNA library construction. ①: the RNA is directly converted to cDNA; the cDNA is then fragmented and used for library preparation. ②: a classical protocol. One method first performs reverse transcription (RT) using random primers, followed by adapter ligation and sequencing (left). The other method first sequentially ligates 3′ and 5′ adapters, then performs cDNA synthesis with a primer complementary to the adapter (RT-primer), followed by sequencing (right). When an RT-primer with a specific sequence is used, mispriming can occur because the RT-primer anneals to transcript sequences with some complementarity (RT mispriming). (c) RNA-seq platform (including pyrosequencing, sequencing-by-synthesis, and single-molecule sequencing). Biases at this stage can be introduced by insertions and deletions, raw single-pass data, etc.
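The RT mispriming described in panel (b) can be illustrated with a minimal Python sketch. It is not a method from the paper: it assumes, as a simplification, that a candidate mispriming site is any position where the transcript contains an exact reverse complement of the last `k` bases at the 3′ end of the RT-primer. The function names, the default `k = 6`, and the sequences are hypothetical and chosen only for illustration.

```python
from typing import List


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_mispriming_sites(transcript: str, rt_primer: str, k: int = 6) -> List[int]:
    """List 0-based transcript positions where the primer's 3' end could anneal.

    Assumption for illustration: exact complementarity over the last k
    bases of the primer is enough to flag a candidate mispriming site.
    """
    if k <= 0 or k > len(rt_primer):
        raise ValueError("k must be between 1 and the primer length")

    # The primer anneals antiparallel to the transcript, so the site on the
    # transcript reads as the reverse complement of the primer's 3' end.
    site = reverse_complement(rt_primer[-k:])

    # Treat RNA as DNA letters so both sequences share one alphabet.
    target = transcript.upper().replace("U", "T")

    hits = []
    start = target.find(site)
    while start != -1:
        hits.append(start)
        start = target.find(site, start + 1)  # allow overlapping sites
    return hits


# Hypothetical primer and transcript, not real adapter sequences.
primer = "AAAACCGGTT"
transcript = "GGGAACCGGAAAUUUAACCGGCC"
print(find_mispriming_sites(transcript, primer))  # prints [3, 15]
```

Real annealing also tolerates mismatches and depends on temperature and sequence context, so exact matching as used here is only a rough first screen.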