A comparison of transcriptome analysis methods with reference genome

RNA-seq technology has come into wider use, and the number of available analysis procedures has grown over the past years. Selecting an appropriate workflow has become an important issue for researchers in the field.

In their study, researchers from Capital Medical University compared six popular analytical procedures (pipelines) using four RNA-seq datasets, one each from mouse, human, rat, and macaque. Gene expression values, fold changes of gene expression, and statistical significance were evaluated to compare the similarities and differences among the six procedures. qRT-PCR was performed to validate the differentially expressed genes (DEGs) from all six procedures.

Cufflinks-Cuffdiff demands the most computing resources and Kallisto-Sleuth the least. Gene expression values, fold changes, and the p and q values of differential expression (DE) analysis are highly correlated among the procedures that use HTseq for quantification. For genes with medium expression abundance, the expression values determined using the different procedures were similar. The major differences in expression values come from genes with particularly high or low expression levels. HISAT2-StringTie-Ballgown is more sensitive to genes with low expression levels, while Kallisto-Sleuth may only be useful for evaluating genes with medium to high abundance. When the same thresholds for fold change and p value are chosen in DE analysis, StringTie-Ballgown produces the fewest DEGs, while HTseq-DESeq2, -edgeR, or -limma generally produce more DEGs. The performance of Cufflinks-Cuffdiff and Kallisto-Sleuth varies across datasets. For DEGs with medium expression levels, the biological verification rates were similar among all procedures.
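To make "the same thresholds for fold change and p value" concrete, here is a minimal sketch of threshold-based DEG calling. The cutoffs (|log2 fold change| of at least 1, p value below 0.05), the gene names, and the numbers are hypothetical placeholders, not values taken from the study.

```python
from typing import Dict, NamedTuple, Set


class DEResult(NamedTuple):
    """One gene's differential expression result (hypothetical layout)."""
    log2_fold_change: float
    p_value: float


def call_degs(
    results: Dict[str, DEResult],
    min_abs_log2fc: float = 1.0,  # assumed cutoff, not from the paper
    max_p: float = 0.05,          # assumed cutoff, not from the paper
) -> Set[str]:
    """Return genes that pass both the fold-change and the p-value threshold."""
    return {
        gene
        for gene, r in results.items()
        if abs(r.log2_fold_change) >= min_abs_log2fc and r.p_value < max_p
    }


# Made-up results for four illustrative genes.
example_results = {
    "GeneA": DEResult(2.3, 0.001),   # passes both thresholds
    "GeneB": DEResult(0.4, 0.0004),  # significant, but fold change too small
    "GeneC": DEResult(-1.8, 0.03),   # down-regulated, passes both
    "GeneD": DEResult(3.1, 0.2),     # large fold change, not significant
}

print(sorted(call_degs(example_results)))  # ['GeneA', 'GeneC']
```

Applying one such filter to the output of every procedure is what makes the DEG counts comparable between procedures.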

Fig. 7. Guidelines for researchers to decide the appropriate procedure for RNA-seq analysis

Results are highly correlated among the RNA-seq analysis procedures that use HTseq for quantification. Differences in gene expression values mainly come from genes with particularly high or low expression levels. Moreover, the biological validation rates of DEGs from all six procedures were similar for genes with medium expression levels. Investigators can choose an analytical procedure according to their available computing resources, or according to whether genes with high or low expression levels are of interest. If computing resources are abundant, one can run multiple procedures and take the intersection of their results to get the most reliable DEGs, or take the union of their results to get a more comprehensive DE profile of the transcriptome.
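The two ways of combining results can be sketched with plain set operations. This is an illustrative example only: the function names, the gene names, and the DEG lists are hypothetical, and each set is assumed to hold the DEGs one procedure reported under a common threshold.

```python
from typing import Dict, Set


def consensus_degs(deg_sets: Dict[str, Set[str]]) -> Set[str]:
    """Genes reported by every procedure (intersection): fewer, more reliable."""
    sets = list(deg_sets.values())
    return set.intersection(*sets) if sets else set()


def combined_degs(deg_sets: Dict[str, Set[str]]) -> Set[str]:
    """Genes reported by at least one procedure (union): broader DE profile."""
    return set().union(*deg_sets.values())


# Made-up DEG lists keyed by procedure name.
example_deg_sets = {
    "HTseq-DESeq2": {"GeneA", "GeneB", "GeneC"},
    "StringTie-Ballgown": {"GeneA"},
    "Kallisto-Sleuth": {"GeneA", "GeneC", "GeneD"},
}

print(sorted(consensus_degs(example_deg_sets)))  # ['GeneA']
print(sorted(combined_degs(example_deg_sets)))   # ['GeneA', 'GeneB', 'GeneC', 'GeneD']
```

The intersection shrinks as more procedures are added, so it trades sensitivity for confidence. The union does the opposite.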

Liu X, Zhao J, Xue L, Zhao T, Ding W, Han Y, Ye H. (2022) A comparison of transcriptome analysis methods with reference genome. BMC Genomics 23(1):232. [article]
