By measuring messenger RNA levels for all genes in a sample, RNA-seq provides an attractive way to characterize global changes in transcription, and it is becoming a widely used platform for gene expression profiling. However, real transcription signals in RNA-seq data are confounded by measurement and sequencing errors and by other random biological and technical variation.
To extract biologically meaningful information about the transcription process from RNA-seq data, researchers from the UTHSCH propose modeling the data with a second-order ordinary differential equation (ODE). They use differential principal analysis to develop statistical methods for estimating the location-varying coefficients of the ODE, and they validate how accurately the ODE model fits the RNA-seq data by prediction analysis and 5-fold cross-validation. To further evaluate the model's performance for RNA-seq data analysis, the researchers use the location-varying coefficients of the ODE as features to classify normal and tumor cells. They demonstrate that even an ODE model for a single gene can achieve high classification accuracy. They also conduct response analysis to investigate how the transcription process responds to perturbation of external signals, and identify dozens of genes that are related to cancer.
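To make the idea of fitting a second-order ODE to a signal concrete, here is a minimal sketch. It is not the researchers' method: it swaps differential principal analysis for plain finite differences plus least squares, it assumes constant rather than location-varying coefficients, and it uses a noise-free synthetic profile instead of real RNA-seq read counts. The function name `fit_second_order_ode` and the model form x'' + a·x' + b·x = 0 are assumptions made for illustration only.

```python
import numpy as np


def fit_second_order_ode(t, x):
    """Least-squares estimate of (a, b) in x'' + a*x' + b*x = 0.

    Illustrative only: constant coefficients, derivatives taken by
    finite differences (a stand-in, not differential principal analysis).
    """
    dx = np.gradient(x, t)    # first derivative along the position axis
    d2x = np.gradient(dx, t)  # second derivative
    # Drop two points at each end, where one-sided differences are less accurate.
    inner = slice(2, -2)
    design = np.column_stack([dx[inner], x[inner]])
    coef, *_ = np.linalg.lstsq(design, -d2x[inner], rcond=None)
    return coef  # array([a, b])


# Hypothetical noise-free "profile": a damped oscillation that solves
# x'' + 0.4*x' + 4.04*x = 0 exactly.
t = np.linspace(0.0, 10.0, 2001)
x = np.exp(-0.2 * t) * np.cos(2.0 * t)

a_hat, b_hat = fit_second_order_ode(t, x)
print(a_hat, b_hat)  # estimates should be close to 0.4 and 4.04
```

In this simplified setting the two fitted numbers play the role of a per-gene feature vector, which is the kind of input a classifier could use to separate normal from tumor samples. Real read-count data are noisy, so finite differences would amplify the noise and a smoothing or functional-data step would be needed first.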
(a) Circular phylogram tree of 19717 genes that were clustered into nine groups by Dendroscope 3.2.10. (b) Detailed circular phylogram tree of 19717 genes that were clustered into nine groups by Dendroscope 3.2.10.