Transcriptome software results show significant variation among different commercial pipelines

Researchers at New Mexico State University have been documenting biological responses to low levels of radiation (natural background) and very low levels of radiation (below background). Because these studies test mild external stimuli, relatively mild biological responses are expected. The researchers recently published a transcriptome software comparison study based on RNA-Seq data from a below-background radiation treatment of two model organisms, E. coli and C. elegans (Thawng and Smith, BMC Genomics 23:452, 2022). They reported DNAstar-D (DESeq2 in the DNAstar software pipeline) to be the more conservative, realistic tool for differential gene expression analysis compared with the other transcriptome software packages tested: CLC, Partek and DNAstar-E (edgeR in the DNAstar pipeline). Here they report two follow-up studies, one of which adds a new model organism (Aedes aegypti) and another software package (Azenta), on transcriptome responses to varying dose rates from three different sources of natural radiation.

When E. coli was exposed to varying levels of K40, the researchers again found that the DNAstar-D pipeline yielded a more conservative number of differentially expressed genes (DEGs) and lower fold differences than the CLC and DNAstar-E pipelines run in parallel. After a 30-read minimum cutoff criterion was applied to the data, the number of significant DEGs ranged from 0 to 81 with DNAstar-D, from 4 to 117 with DNAstar-E and from 14 to 139 with CLC. In terms of the extent of expression, the highest fold change for a DEG was observed with DNAstar-E (19.7-fold), followed by CLC (12.5-fold) and DNAstar-D (4.3-fold). In a recently completed study with Ae. aegypti that included another software package (Azenta), they analyzed the RNA-Seq response to similar sources of low-level radiation and again found that the DNAstar-D pipeline gave the more conservative number and fold expression of DEGs compared with the other software packages. The number of significant DEGs ranged from 31 to 221 in Azenta, 31 to 237 in CLC, 19 to 252 in DNAstar-E and 0 to 67 in DNAstar-D. The highest fold change was found with CLC (1,350.9-fold), with DNAstar-E (5.9-fold) and Azenta (5.5-fold) intermediate and DNAstar-D the lowest (4-fold).
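To make the effect of a minimum read cutoff concrete, here is a minimal Python sketch of filtering a pipeline's output before counting DEGs. It is an illustration only, not the authors' procedure. The summary does not say how the 30-read criterion was applied, so the rule used here (keep a gene if at least one condition has 30 or more reads) is an assumption. The fold-change threshold (2.0), the adjusted p-value threshold (0.05), the `GeneResult` type, the `significant_degs` function and the example genes are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneResult:
    gene: str
    control_reads: int   # raw read count in the control condition
    treated_reads: int   # raw read count in the treated condition
    fold_change: float   # signed fold change as reported by a pipeline
    adj_p: float         # multiple-testing-adjusted p-value


def significant_degs(
    results: List[GeneResult],
    min_reads: int = 30,    # read cutoff; how it is applied here is an assumption
    min_fold: float = 2.0,  # hypothetical fold-change threshold
    alpha: float = 0.05,    # hypothetical significance threshold
) -> List[GeneResult]:
    """Return genes that pass the read cutoff and both significance thresholds."""
    kept = []
    for r in results:
        # Assumed rule: drop a gene when neither condition reaches min_reads.
        if max(r.control_reads, r.treated_reads) < min_reads:
            continue
        if abs(r.fold_change) >= min_fold and r.adj_p <= alpha:
            kept.append(r)
    return kept


# Made-up values for illustration; these are not data from the study.
example = [
    GeneResult("geneA", 500, 1200, 2.4, 0.01),
    GeneResult("geneB", 3, 25, 8.3, 0.001),   # large fold change from very few reads
    GeneResult("geneC", 800, 900, 1.1, 0.40),
    GeneResult("geneD", 40, 10, -4.0, 0.02),
]

with_cutoff = significant_degs(example, min_reads=30)
without_cutoff = significant_degs(example, min_reads=0)

print([r.gene for r in with_cutoff])
print([r.gene for r in without_cutoff])
```

In this toy input, the gene with the largest fold change (geneB) is also the one with the fewest reads, so the cutoff removes it. Ratios computed from very small counts are unstable, which is one plausible reason a read cutoff can reduce both the number of DEGs and the most extreme fold changes.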

Fig. 1. The different program mapping, normalization and statistical approaches used for each of the software pipelines in this study

This study once again highlights the importance of choosing appropriate software for transcriptome analysis. Using three different biological models (bacteria, nematode and mosquito) in four different studies testing very low levels of radiation (Van Voorhies et al., Front Public Health 8:581796, 2020; Thawng and Smith, BMC Genomics 23:452, 2022; current study), the CLC software package produced what appears to be an exaggerated gene expression response, both in the number of DEGs and in the extent of expression. Setting a 30-read cutoff diminishes this exaggerated response in most of the software packages tested. The researchers have further affirmed that DNAstar-D (DESeq2) gives a more conservative transcriptome expression pattern, which appears more suitable for studies in which subtle changes in gene expression are expected.

Thawng CN, Smith GB. (2023) Transcriptome software results show significant variation among different commercial pipelines. BMC Genomics 24(1):662. [article]
