I have a rather general question about how and why Bayesian inference has been applied in limma, edgeR, DESeq, and DESeq2.
In the simplest differential expression study there are two regression coefficients per gene, an intercept and a slope; the latter is the log fold change (LFC). LFC estimates obtained across the features (genes) are quite noisy and are likely to benefit from shrinkage. By the early 2000s there were plenty of general-purpose shrinkage methods, such as ridge regression, that could have helped with that directly or at least provided some methodological clues.
However, LFC shrinkage was more or less ignored in limma, edgeR, and DESeq between 2004 and 2013, until it was added to DESeq2 in 2014. The DESeq2 paper (Love et al., 2014) mentions three other Bayesian papers that moderate LFC, but they are all dated 2012-2013. Between 2004 and 2012, shrinkage was applied only to the second-order (variance) parameters, and my question is why.
The first version of limma (Smyth, 2004) provides an answer, but only a partial one. That paper proposed a two-component prior for LFC: LFC is equal to zero with probability p > 0 and is N(0, sigma) otherwise. The posterior odds statistic shrinks both LFC and the second-order parameters. It then turned out that there are technical issues with estimating p and sigma from the data, so what we end up using in practice is the “moderated t-statistic”, which doesn't actually moderate LFC.
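To make the two-component prior concrete, here is a minimal Python sketch of what it implies for a single gene. It is a simplification of my own, not the model in Smyth (2004): I assume the sampling variance of the LFC estimate is known, and I treat the second argument of N(0, sigma) as a variance. The names `spike_slab_posterior`, `p_zero`, `s2`, and `v` are hypothetical.

```python
import math


def normal_pdf(x, var):
    """Density at x of a zero-mean normal with variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def spike_slab_posterior(b, s2, p_zero, v):
    """Posterior summary for one gene under a two-component prior.

    b      : observed LFC estimate
    s2     : sampling variance of b (ASSUMED known, a simplification)
    p_zero : prior probability that the true LFC is exactly zero
    v      : prior variance of the true LFC when it is nonzero
    Returns (posterior probability that LFC != 0, posterior mean of LFC).
    """
    # Marginal density of b under each component of the prior.
    like_zero = normal_pdf(b, s2)         # true LFC = 0
    like_nonzero = normal_pdf(b, s2 + v)  # true LFC ~ N(0, v)

    numer = (1.0 - p_zero) * like_nonzero
    post_nonzero = numer / (p_zero * like_zero + numer)

    # Given LFC != 0 the posterior mean is the ridge-type estimate
    # b * v / (v + s2); the zero component contributes nothing.
    post_mean = post_nonzero * b * v / (v + s2)
    return post_nonzero, post_mean
```

In this sketch the LFC is shrunk twice: once by the ridge-type factor v / (v + s2) and once more by the posterior probability of being nonzero. Both factors depend on p and the prior variance, which are the quantities that are hard to estimate from the data.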
What happened next was rather puzzling. Instead of using a two-component model, in 2014 Love et al. proposed a straightforward two-stage procedure. First, the variance parameters get moderated. Second, treating the variance parameters as known, something quite similar to ridge regression is applied, where the prior for LFC is N(0, sigma) without the zero component. Apparently it worked fine in practice, so my question is why something like that was never considered before, especially in limma, which is based on a simple linear model.
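For reference, the second stage in its simplest form can be sketched in a few lines of Python. This is a normal-approximation illustration of my own, not DESeq2's actual procedure (DESeq2 works with a negative binomial GLM and estimates the prior width from the data). I assume the sampling variance `s2` is already fixed by the first stage and the prior variance `tau2` is given; the function name `ridge_shrink_lfc` is hypothetical.

```python
def ridge_shrink_lfc(b, s2, tau2):
    """Posterior mean and variance of the LFC under a N(0, tau2) prior.

    b    : unshrunken LFC estimate for one gene
    s2   : sampling variance of b (ASSUMED known after variance moderation)
    tau2 : prior variance of the true LFC (ASSUMED given)
    """
    # Fraction of the raw estimate that survives shrinkage:
    # close to 1 for precise estimates, close to 0 for noisy ones.
    weight = tau2 / (tau2 + s2)
    post_mean = weight * b
    post_var = weight * s2  # equals 1 / (1 / s2 + 1 / tau2)
    return post_mean, post_var
```

With `s2` equal to `tau2` the weight is 0.5, so the estimate is halved; as `s2` grows the estimate is pulled further toward zero, but it never becomes exactly zero, which is the main difference from the two-component prior.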
I can speculate that the two-component prior for LFC seemed far more adequate to Smyth than the N(0, sigma) prior, so after the former didn't work out he decided not to pursue LFC moderation with the latter. Another possibility is that DESeq2 might have estimation issues similar to those of the posterior odds in Smyth, 2004. Please let me know what you think.