Hello,

I am following the guide found here (http://www-huber.embl.de/users/klaus/Teaching/DESeq2Predoc2014.html) to estimate differential gene expression in my two-condition RNAseq dataset. When I look at my histogram of observed p-value statistics I end up with a plot like this:

dds <- nbinomWaldTest(dds)

ddsRes <- results(dds, pAdjustMethod = "BH")

ddsRes <- ddsRes[ !is.na(ddsRes$padj), ]

ddsRes <- ddsRes[ !is.na(ddsRes$pvalue), ]

table(ddsRes[,"padj"] < 0.1) #240 TRUE 19923 FALSE

hist(ddsRes$pvalue, col = "lavender", main = "TRT vs NoTRT", xlab = "p-values")

When I run fdrtool on this dataset the estimated empirical sigma is 1.176:

FDR.ddsRes <- fdrtool(ddsRes$stat, statistic= "normal", plot = T)

ddsRes[,"padj"] <- p.adjust(FDR.ddsRes$pval, method = "BH")

table(ddsRes[,"padj"] < 0.1) #27 TRUE 20136 FALSE

When I look at the number of rejected hypotheses after running the fdrtool procedure the number is much smaller (27 rejections compared to 240 using the BH FDR adjustment directly).

I'm wondering:

1) is it appropriate to use the fdrtool procedure when sigma is estimated to be >1? All of the examples I've seen thus far (including in the DESeq tutorials) focus on the case where sigma <1.

2) if the answer to 1) is yes, what are some of the assumptions about the experiment and results that would make a sigma >1 need to be controlled with the more conservative (in this case) fdrtool correction?

Any help would be greatly appreciated!