DESeq2 and fdrtool: applying FDR correction when the empirical sigma is greater than 1
1
1
Entering edit mode
dhibar ▴ 60
@dhibar-10079
Last seen 4.1 years ago

Hello,

I am following the guide found here (http://www-huber.embl.de/users/klaus/Teaching/DESeq2Predoc2014.html) to estimate differential gene expression in my two-condition RNAseq dataset. When I look at my histogram of observed p-value statistics I end up with a plot like this:

dds <-  nbinomWaldTest(dds)
ddsRes <- results(dds, pAdjustMethod = "BH")
ddsRes <- ddsRes[ !is.na(ddsRes$padj), ]
ddsRes <- ddsRes[ !is.na(ddsRes$pvalue), ]
table(ddsRes[,"padj"] < 0.1) #240 TRUE 19923 FALSE
hist(ddsRes$pvalue, col = "lavender", main = "TRT vs NoTRT", xlab = "p-values")

 

 

When I run fdrtool on this dataset the estimated empirical sigma is 1.176:

FDR.ddsRes <- fdrtool(ddsRes$stat, statistic= "normal", plot = T)
ddsRes[,"padj"]  <- p.adjust(FDR.ddsRes$pval, method = "BH")
table(ddsRes[,"padj"] < 0.1) #27 TRUE 20136 FALSE

When I look at the number of rejected hypotheses after running the fdrtool procedure the number is much smaller (27 rejections compared to 240 using the BH FDR adjustment directly).

I'm wondering:

1) is it appropriate to use the fdrtool procedure when sigma is estimated to be >1? All of the examples I've seen thus far (including in the DESeq tutorials) focus on the case where sigma <1.

2) if the answer to 1) is yes, what are some of the assumptions about the experiment and results that would make a sigma >1 need to be controlled with the more conservative (in this case) fdrtool correction?

Any help would be greatly appreciated!

deseq2 fdrtool • 1.9k views
ADD COMMENT
1
Entering edit mode
Bernd Klaus ▴ 590
@bernd-klaus-6281
Last seen 2.7 years ago
Germany

Hi,

thanks for your feedback!

1.) yes, it is appropriate to use fdrtool if sigma > 1. After all, the null model variance can also be underestimated. That being said, keep in mind that the sigma is an *estimate* and judging your p-value histogram, it seems that the original p-vlaue might just be little bit to "liberal" (you seem to have a slight enrichment of low p-values), so you can just keep them as they are. It is not a "must", I only applied emprical null modelling in the example because the p-values clearly had a strange shape.

You can also try to use the fdrtool qvalues (= FDR values) directly. What does

table(FDR.ddsRes$qval < 0.1) give you ?

Also try locfdr maybe, (https://cran.r-project.org/web/packages/locfdr) ,

which sigma  does locfdr(ddsRes$stat) give you?

 

2.) Basically, if fdrtool says that your actual sigma is greater than one, it means that your p-values are  a bit too low on average, so that you might have a higher proportion of false positives than the nominal FDR suggests. You also see this in the histogram, apart from peak at zero, the bars up to a pval of ~ 0.2 are maybe a bit too high be truly uniform.

Hope that helps,

Bernd

 

ADD COMMENT

Login before adding your answer.

Traffic: 314 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6