I am following the guide found here (http://www-huber.embl.de/users/klaus/Teaching/DESeq2Predoc2014.html) to estimate differential gene expression in my two-condition RNAseq dataset. When I look at my histogram of observed p-value statistics I end up with a plot like this:
dds <- nbinomWaldTest(dds)
ddsRes <- results(dds, pAdjustMethod = "BH")
ddsRes <- ddsRes[ !is.na(ddsRes$padj), ]
ddsRes <- ddsRes[ !is.na(ddsRes$pvalue), ]
table(ddsRes[,"padj"] < 0.1) #240 TRUE 19923 FALSE
hist(ddsRes$pvalue, col = "lavender", main = "TRT vs NoTRT", xlab = "p-values")
When I run fdrtool on this dataset the estimated empirical sigma is 1.176:
FDR.ddsRes <- fdrtool(ddsRes$stat, statistic= "normal", plot = T)
ddsRes[,"padj"] <- p.adjust(FDR.ddsRes$pval, method = "BH")
table(ddsRes[,"padj"] < 0.1) #27 TRUE 20136 FALSE
When I look at the number of rejected hypotheses after running the fdrtool procedure the number is much smaller (27 rejections compared to 240 using the BH FDR adjustment directly).
1) is it appropriate to use the fdrtool procedure when sigma is estimated to be >1? All of the examples I've seen thus far (including in the DESeq tutorials) focus on the case where sigma <1.
2) if the answer to 1) is yes, what are some of the assumptions about the experiment and results that would make a sigma >1 need to be controlled with the more conservative (in this case) fdrtool correction?
Any help would be greatly appreciated!