I have a rather general question about how and why Bayesian inference has been applied in limma, edgeR, and DESeq1-2.
In the simplest differential expression study there are two regression coefficients: an intercept and a slope, the latter usually called the log fold change (LFC). LFC estimates obtained across the features (genes) are quite noisy and are likely to benefit from shrinkage. By the early 2000s there were plenty of general-purpose shrinkage methods, such as ridge regression, that could have helped with this directly, or at least provided some methodological clues.
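To make the point concrete, here is a toy numpy sketch (not limma/DESeq2 code) of why shrinkage helps noisy per-gene LFC estimates. Everything here is an assumption for illustration: true LFCs drawn from a known N(0, tau) prior, made-up per-gene standard errors, and the usual normal-prior posterior mean, which is the same formula ridge regression produces.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes = 5000

# Illustrative setup: true LFCs come from N(0, tau), and the raw
# per-gene estimates add independent noise with known standard errors.
tau = 0.5                                   # assumed prior sd of true LFCs
true_lfc = rng.normal(0.0, tau, n_genes)
se = rng.uniform(0.3, 1.0, n_genes)         # made-up per-gene standard errors
lfc_hat = true_lfc + rng.normal(0.0, se)    # noisy raw estimates

# Ridge-style shrinkage: the posterior mean under an N(0, tau^2) prior
# pulls each estimate toward zero by the factor tau^2 / (tau^2 + se^2),
# so noisier genes are shrunk more strongly.
shrunk = lfc_hat * tau**2 / (tau**2 + se**2)

mse_raw = np.mean((lfc_hat - true_lfc) ** 2)
mse_shrunk = np.mean((shrunk - true_lfc) ** 2)
print(mse_raw, mse_shrunk)
```

With the prior scale known, the shrunk estimates have lower average squared error than the raw ones; the practical difficulty, as discussed below, is estimating that scale from the data.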
However, LFC shrinkage was more or less ignored in limma, edgeR, and DESeq between 2004 and 2013, until it was added to DESeq2 in 2014. The DESeq2 paper (Love et al., 2014) mentions three other Bayesian papers that moderate the LFC, but they are all dated 2012-2013. Between 2004 and 2012, shrinkage was applied only to the second-order (variance) parameters, and my question is why.
The first version of limma (Smyth, 2004) provides an answer, but only a partial one. That paper proposed a two-component prior for the LFC: the LFC is equal to zero with probability p > 0 and is N(0, sigma) otherwise. The resulting posterior odds statistic shrinks both the LFC and the second-order parameters. It then turned out that there are technical issues with estimating p and sigma from the data, so what we end up using in practice is the “moderated t-statistic”, which doesn't actually moderate the LFC.
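For intuition, here is a hand-rolled Python sketch (not limma code) of what the two-component, spike-and-slab posterior looks like when p and the slab variance are simply assumed known; the paper's whole difficulty is of course that they must be estimated. All numbers below are made up for illustration.

```python
import math

def two_component_posterior(lfc_hat, se, p=0.1, v=1.0):
    """Toy posterior under a spike-and-slab prior:
    LFC = 0 with probability 1 - p, LFC ~ N(0, v) with probability p.
    p and v are treated as known here, which is the easy version."""
    def norm_pdf(x, var):
        return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

    # Marginal likelihood of the observed estimate under each component.
    like_null = norm_pdf(lfc_hat, se**2)       # LFC exactly zero
    like_slab = norm_pdf(lfc_hat, v + se**2)   # LFC drawn from the slab

    odds = (p * like_slab) / ((1.0 - p) * like_null)  # posterior odds of DE
    prob_de = odds / (1.0 + odds)

    # Posterior mean: slab (ridge-style) shrinkage, further multiplied by
    # the probability of being nonzero, so the LFC is shrunk twice over.
    post_mean = prob_de * lfc_hat * v / (v + se**2)
    return odds, post_mean

odds, post_mean = two_component_posterior(lfc_hat=2.0, se=0.5)
```

Note that the posterior mean shrinks the LFC through two channels: the ridge-like factor v / (v + se^2) from the slab, and the posterior probability of not sitting in the spike.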
What happened next is rather puzzling. Instead of using a two-component model, in 2014 Love et al. proposed a straightforward two-stage procedure: first, the variance parameters are moderated; second, assuming the variance parameters are known, something quite similar to ridge regression is applied, with an N(0, sigma) prior on the LFC and no zero component. Apparently this worked fine in practice, so my question is why nothing like it was considered earlier, especially in limma, which is based on a simple linear model.
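The two-stage idea can be sketched in a few lines of numpy. To be clear, this is not the actual DESeq2 algorithm (which works on negative binomial GLMs and estimates the prior width from the data); the prior degrees of freedom, the prior LFC variance tau2, and the simulated variances are all assumptions for illustration. Stage 1 uses the limma-style weighted average of per-gene and prior variances; stage 2 plugs the moderated variance into a normal-prior (ridge-like) LFC shrinkage.

```python
import numpy as np

rng = np.random.default_rng(1)
n_genes = 2000

# Stage 0 (simulation): noisy per-gene variance estimates.
raw_var = rng.gamma(shape=2.0, scale=0.25, size=n_genes)
prior_var = raw_var.mean()    # crude stand-in for an estimated prior variance
d0 = 4.0                      # assumed prior degrees of freedom
df = 6.0                      # residual degrees of freedom per gene

# Stage 1: moderated variance, a weighted average of the per-gene
# estimate and the prior value (the form of limma's posterior variance).
mod_var = (d0 * prior_var + df * raw_var) / (d0 + df)

# Stage 2: treat mod_var as known and shrink the LFCs under an
# N(0, tau2) prior, exactly as ridge regression would.
tau2 = 0.25                                  # assumed prior variance of LFCs
lfc_hat = rng.normal(0.0, 1.0, n_genes)      # raw LFC estimates
lfc_shrunk = lfc_hat * tau2 / (tau2 + mod_var)
```

Every estimate moves toward zero, and genes with larger moderated variance are pulled harder, which is the qualitative behavior the DESeq2 paper reports.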
I can speculate that the two-component prior for the LFC seemed far more adequate to Smyth than the N(0, sigma) prior, so after the former didn't work out he decided not to pursue LFC moderation with the latter. Another possibility is that DESeq2 might have estimation issues similar to those of the posterior odds in Smyth, 2004. Please let me know what you think.
This isn't so much an answer as a comment. While I of course think a Bayesian posterior LFC is a useful statistic, it's not straightforward to implement. Figuring out an algorithm that outputs a reasonable scale for the prior is not trivial. The shrinkage estimator in DESeq2 is always biased toward 0, by definition, but for some datasets there is too much bias, and we're working on fixing this. For example, the RNA-seq mixology paper and dataset showed too much bias for DESeq2. It's definitely ongoing work, and we should have some new shrinkage methods in place that can be used via lfcShrink() (it looks like not in this devel cycle but the next one, because I want to have enough time for testing before it's user-facing).
Another comment: we do reference, in the DESeq2 Methods section, microarray methods that proposed moderated fold changes, e.g. Newton et al. 2001. That section starts with: "As was observed with differential expression analysis using microarrays, genes with low intensity values tend to suffer from a small signal-to-noise ratio..."
Given what Ryan said below, does lfcShrink follow the suggestion I posted here:
C: Interactions in DESeq2
No, we won't be applying priors to contrasts of coefficients, only to the coefficients themselves (either all non-intercept coefficients or individual ones).