Since large-scale sequencing platforms emerged in 2005, methods for decoding DNA sequences have been revolutionized, and this shift has also affected quantitative and qualitative gene expression analysis through the RNA-Sequencing (RNA-Seq) technique. However, the amount of data required for these analyses remains a concern because it affects the reliability of the experiments. Thus, RNA depletion during sample preparation may influence the results. Moreover, because the data produced by these platforms vary in quality, quality filters are often used to remove sequences likely to contain errors and so increase the accuracy of the results. However, removing reads with quality filters may itself influence the expression profile observed in RNA-Seq experiments.
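To make the idea of a quality filter concrete, the following is a minimal sketch of one common variant: discarding reads whose mean Phred quality value (QV) falls below a threshold such as 20 or 30. It assumes FASTQ-style quality strings in Phred+33 encoding and a mean-quality criterion; the function names and the toy reads are hypothetical, and the exact filtering rule used in any particular study may differ.

```python
# Sketch of a mean-quality read filter (assumption: Phred+33 quality strings
# and a mean-QV criterion; names and toy data are illustrative only).

def phred_scores(quality_line, offset=33):
    """Decode a quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line, offset=33):
    """Return the mean Phred score of a read, or 0.0 for an empty read."""
    scores = phred_scores(quality_line, offset)
    return sum(scores) / len(scores) if scores else 0.0


def filter_reads(records, min_mean_qv):
    """Keep only (name, sequence, quality) records meeting the threshold."""
    return [rec for rec in records if mean_quality(rec[2]) >= min_mean_qv]


# Toy reads: (name, sequence, quality string)
reads = [
    ("read1", "ACGT", "IIII"),  # every base has QV 40
    ("read2", "ACGT", "5555"),  # every base has QV 20
    ("read3", "ACGT", "++++"),  # every base has QV 10
]

kept_qv20 = filter_reads(reads, 20)
kept_qv30 = filter_reads(reads, 30)

# Fraction of reads discarded by the stricter filter
removed_fraction_qv30 = 1 - len(kept_qv30) / len(reads)
```

A stricter threshold keeps fewer reads, which is why the choice of filter can change the read counts per gene and, in turn, the expression profile.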
A recent study from the Federal University of Pará, Brazil, analyzed the impact of different quality filter values on RNA-Seq data generated on both the SOLiD and Illumina platforms. Although up to 47.9% of the reads produced by the SOLiD technology were removed after the QV20 quality filter was applied, and 15.85% were removed from the data set using the QV30 filter, Illumina data showed the largest number of unique differentially expressed genes after applying the most stringent filter (QV30), with 69 genes. In contrast, for SOLiD, the acid stress condition with the QV20 filter yielded only 41 unique differentially expressed genes. Even for the highest quality data, the quality filter affected the expression profile. The most stringent quality filter generated a greater number of unique differentially expressed genes: 9 for the high molecular weight dissolved organic matter condition and 12 for the low P condition.
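One way to read "unique differentially expressed genes" is as the genes called differentially expressed under one filter setting but under none of the others. The sketch below illustrates that set-based interpretation only; it is an assumption about the definition, not the study's documented method, and the gene identifiers and the `unique_degs` helper are hypothetical placeholders rather than results.

```python
# Sketch: genes differentially expressed under exactly one filter setting.
# Assumption: "unique" means present in one setting's gene set and absent
# from all others. Gene IDs below are placeholders, not data from the study.

def unique_degs(deg_sets):
    """Map each setting to the genes found only under that setting."""
    unique = {}
    for name, genes in deg_sets.items():
        # Union of the gene sets from every other setting
        others = set().union(
            *(g for other, g in deg_sets.items() if other != name)
        )
        unique[name] = genes - others
    return unique


deg_sets = {
    "no_filter": {"geneA", "geneB"},
    "QV20": {"geneA", "geneB", "geneC"},
    "QV30": {"geneA", "geneD", "geneE"},
}

unique = unique_degs(deg_sets)
unique_counts = {name: len(genes) for name, genes in unique.items()}
```

Comparing `unique_counts` across settings gives the kind of per-filter tally reported above.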