Department of Biostatistics
University of Michigan
Statistical modeling of RNA sequencing data
Ultra-high-throughput sequencing of transcriptomes (RNA-Seq) has recently become a widely used technique for measuring gene expression due to its decreasing cost, wide dynamic range of detection, and accurate measurement of transcript abundance. However, systematic biases introduced during sequencing and read mapping, as well as the incompleteness of transcript annotation databases, may make estimates of transcript abundance unreliable. Furthermore, the nature of RNA-Seq makes it nearly impossible to provide absolute measurements of transcript abundance. In this talk, we will introduce several statistical approaches for modeling RNA-Seq data, including models for robust estimation of transcript abundance and for joint detection of differential expression and normalization of RNA-Seq data. Optimization techniques for fitting these models will also be discussed. These statistical methods have potential uses in applications beyond genomics.
Originally published at acms.nd.edu.