This is an answer I posted to Biostars a while ago:
We really tried to write the main text of the paper so that it could be understood by non-statisticians. That said, I'll try to summarize it in a few sentences. Before asking further questions on details, please at least *try* to read the paper :)
Let's say we want to compare counts between two groups. We build a model for the observed counts. This model has three kinds of parameters: (1) a normalization parameter, which accounts at least for differences in library size and can be extended by other software; (2) a variance parameter, called the dispersion; (3) parameters representing the group differences.

Fit (1) using the same method as the original DESeq.

Fit (2) in two steps. First, for each gene, find the value of the parameter that makes the likelihood largest; this is called maximum likelihood estimation. Then look at the values from all of the genes and move each gene's value towards a middle value. Bayes' theorem guides the amount of movement for each gene: if the information for the gene is low, the value is moved more towards the middle; if the information for the gene is high, the value is moved very little.

Fit (3) using the same technique as used for (2). The values for (3) are a useful final product, as are sets of genes where the group differences are likely to be above a threshold (zero or otherwise). These sets are defined by their false discovery rate.
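To make (1) and the "move towards a middle value" idea in (2) a bit more concrete, here is a minimal Python sketch. It is an illustration only, not the DESeq2 code. The function names `size_factors` and `shrink` are made up for this example. The first function assumes the median-of-ratios approach for normalization. The second is a toy normal-normal version of Bayesian shrinkage, assumed here purely to show how low-information estimates move more than high-information ones; the actual dispersion prior and fitting in the software are more involved.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (illustrative).

    counts: list of genes, each a list of per-sample counts.
    Genes with a zero in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c <= 0 for c in row):
            continue
        # log of the gene's geometric mean across samples
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # per sample: median over genes of (count / geometric mean)
    return [math.exp(median(r)) for r in log_ratios]


def shrink(estimate, estimate_var, prior_mean, prior_var):
    """Toy shrinkage: posterior mean under a normal-normal model.

    A noisy estimate (large estimate_var, i.e. little information)
    is pulled strongly towards prior_mean; a precise estimate
    (small estimate_var) barely moves.
    """
    weight = prior_var / (prior_var + estimate_var)
    return weight * estimate + (1.0 - weight) * prior_mean


if __name__ == "__main__":
    # Hypothetical counts: sample 2 is sequenced twice as deeply as sample 1.
    toy_counts = [[10, 20], [100, 200], [5, 10], [0, 7]]
    print(size_factors(toy_counts))

    # Same raw estimate, different amounts of information.
    print(shrink(2.0, 4.0, 0.0, 1.0))   # low information: moved a lot
    print(shrink(2.0, 0.01, 0.0, 1.0))  # high information: moved very little
```

In the toy counts the second sample's size factor comes out at twice the first, and the two `shrink` calls show the same raw estimate landing close to the middle value when it is noisy and staying nearly unchanged when it is precise.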
Let me know if this helps or if you have further questions (you can reply with a comment here on the Bioconductor support site).