RNA sequencing (RNA-seq) has become a widely used technology for analyzing genome-wide changes in gene expression during biological processes. RNA-seq count data are generally assumed to display equidispersion (variance equal to the mean) or overdispersion (variance greater than the mean); therefore, most RNA-seq analysis methods were developed on a negative binomial model, which can capture both equidispersed and overdispersed data.
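As a brief worked illustration of these terms, consider a read count Y with mean μ under the negative binomial parameterization commonly used for RNA-seq counts. The symbols Y, μ, and the dispersion parameter φ are introduced here for illustration and do not come from the original article.

```latex
\operatorname{Var}(Y) = \mu + \phi\,\mu^{2}, \qquad \phi \ge 0
\quad\Longrightarrow\quad
\frac{\operatorname{Var}(Y)}{\mu} = 1 + \phi\,\mu \;\ge\; 1 .
```

As φ approaches 0 the model approaches the Poisson case (equidispersion), and any φ > 0 gives overdispersion. Under this parameterization the variance-to-mean ratio never falls below 1.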
Researchers from Hebei Normal University found that, in addition to equidispersion and overdispersion, RNA-seq data also display underdispersion (variance smaller than the mean), which general RNA-seq analysis methods cannot adequately capture. Based on a double Poisson model, which can capture all three dispersion patterns, they developed a new RNA-seq analysis method, DREAMSeq. In comparisons with five other frequently used RNA-seq analysis methods on simulated datasets, DREAMSeq performed as well as or better than the other methods in terms of type I error rate, statistical power, receiver operating characteristic (ROC) curve, area under the ROC curve, precision-recall curve, and the number of differentially expressed genes (DEGs) detected, especially in situations involving underdispersion. These results were validated by quantitative real-time polymerase chain reaction using a real foxtail millet dataset. The findings indicate that DREAMSeq is a reliable, robust, and powerful new method for RNA-seq data mining.
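To illustrate how a double Poisson model can cover all three dispersion patterns, the following minimal sketch evaluates Efron's double Poisson density with mean parameter mu and dispersion parameter theta, normalizes it numerically, and reports the variance-to-mean ratio. It is not the DREAMSeq implementation, which the text above does not describe; the function names, the truncation point y_max, and the choice of this parameterization are assumptions made for illustration only.

```python
import math


def double_poisson_pmf(mu, theta, y_max=200):
    """Normalized double Poisson probabilities for y = 0..y_max.

    Evaluates Efron's double Poisson density up to its normalizing
    constant, then normalizes by summing over a truncated support.
    y_max is an illustrative assumption; it must be well above mu.
    """
    log_w = []
    for y in range(y_max + 1):
        # Terms shared by every y: theta^(1/2) * exp(-theta * mu)
        lw = 0.5 * math.log(theta) - theta * mu
        if y > 0:
            # exp(-y) * y^y / y!
            lw += -y + y * math.log(y) - math.lgamma(y + 1)
            # (e * mu / y)^(theta * y)
            lw += theta * y * (1.0 + math.log(mu) - math.log(y))
        log_w.append(lw)
    peak = max(log_w)  # subtract the peak before exponentiating
    weights = [math.exp(lw - peak) for lw in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def variance_to_mean(mu, theta, y_max=200):
    """Variance-to-mean ratio of the (truncated) double Poisson."""
    pmf = double_poisson_pmf(mu, theta, y_max)
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return var / mean


if __name__ == "__main__":
    # theta < 1: overdispersion, theta = 1: Poisson, theta > 1: underdispersion
    for theta in (0.5, 1.0, 2.0):
        print(theta, round(variance_to_mean(10.0, theta), 3))
```

With theta = 1 the density reduces to the ordinary Poisson, so the ratio is 1; values of theta above 1 push the ratio below 1, which is the underdispersed regime that a negative binomial model cannot reach.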
eBL-regulated DEGs in foxtail millet roots identified by different methods
(A) Bar plot showing the number of eBL-regulated DEGs identified by DREAMSeq.Mix, edgeR, DESeq, and DESeq2. (B,C) Venn diagrams showing the overlap among the collections of eBL-regulated DEGs identified by DREAMSeq.Mix, edgeR, DESeq, and DESeq2 in non-underdispersion (B) and underdispersion (C) scenarios. nud, non-underdispersion; ud, underdispersion.
Availability – The DREAMSeq R package is available at http://tanglab.hebtu.edu.cn/tanglab/Home/DREAMSeq.