Motivation: High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq), and cell counting. Statistical inference of differential signal in these data must account for their natural variability throughout the dynamic range. When the number of replicates is small, error modeling is needed to achieve statistical power.
Results: We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power.
Availability: A free open-source R/Bioconductor software package, called “DESeq”, is available from http://www-huber.embl.de/users/anders/DESeq
Anders S, Huber W. (2010) Differential expression analysis for sequence count data. Nature Proceedings [Epub ahead of print].
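The error model summarized under Results can be illustrated with a minimal sketch. It assumes the standard reparameterization of the negative binomial distribution by its mean and variance, and it is not the DESeq API. The function names are hypothetical, and toy_variance is only a placeholder for the local-regression fit that links variance to mean; its quadratic form and the dispersion value 0.1 are assumptions made for illustration.

```python
import math


def nb_params(mean, variance):
    """Convert (mean, variance) into negative binomial parameters (r, p).

    Uses the standard identities mean = r(1-p)/p and variance = r(1-p)/p^2,
    which require variance > mean > 0 (overdispersion relative to Poisson).
    """
    if not (variance > mean > 0):
        raise ValueError("negative binomial needs variance > mean > 0")
    p = mean / variance
    r = mean * mean / (variance - mean)
    return r, p


def nb_pmf(k, mean, variance):
    """Probability of observing count k under NB(mean, variance)."""
    r, p = nb_params(mean, variance)
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def toy_variance(mean, dispersion=0.1):
    """Hypothetical variance-mean link (assumption, NOT a local regression).

    Stands in for the fitted curve that maps a mean count to its variance.
    """
    return mean + dispersion * mean ** 2


def nb_upper_tail(k_obs, mean, variance):
    """P(K >= k_obs) under the null NB model, by direct summation."""
    below = sum(nb_pmf(k, mean, variance) for k in range(k_obs))
    return max(0.0, 1.0 - below)


if __name__ == "__main__":
    mu = 10.0
    var = toy_variance(mu)
    # How surprising is a count of 25 when the null mean is 10?
    print(nb_upper_tail(25, mu, var))
```

Parameterizing by mean and variance keeps the sketch close to the wording of the abstract: once a variance is assigned to each mean, the null distribution of a count is fully specified, and tail probabilities follow by summation.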