If I change it to nbins = "auto", then I get the reduction to BH message:
Only 1 bin; IHW reduces to Benjamini Hochberg (uniform weights)
My question is how did you arrive at nbins=4? And how should I go about setting nbins for my own similar proteomics data where nbins = "auto" also reduces to BH?
it's a good question. As is so often the case, this is just a bias/variance/computation time tradeoff. Here by bias-variance I refer to the estimation of the weight function from the rest of the folds and applying it to the held-out fold. In general, in the default choices in the IHW package we have opted for a more conservative route that will often shrink weights towards uniform, thus recovering results equal or very close to those of Benjamini-Hochberg. It is however important to note that this affects power, not FDR control.
By default (choice "auto") it is required that at least 1500 p-values are present in every bin. Through experience this leads to a good estimation of the underlying distribution. However, you might argue that even a somewhat noisier estimate of the distribution could lead to good estimation of the weight function, especially since we add a total variation ("fused lasso") type penalty. And this is indeed true -- but requires a good choice of the regularization parameter. To get a better handle on this we need to do cross validation nested within the fold splitting (specified by `nsplits_interal`). However, this increases computation time by quite a bit, which is why it is only done once by default (rather than e.g. 5 as in your example).
In any case, to answer your question: If you have a relatively small multiple testing situation (such as the proteomics example with only 2666 hypotheses), but still want to apply IHW, then you can use a larger number of bins than what is set by default. If you do this, I strongly encourage you to also increase `nsplits_internal`. Finally, I would not recommend using bins with less than 600 hypotheses or so.