Unexpected p-values by DESeq2
@xianglongruoying-20434
Last seen 5.0 years ago

Dear Community,

I have a question about the p-values reported by DESeq2. I performed a differential expression analysis between cases and controls. The raw p-value reported for gene A, which has the following FPKM values, is only 1.95E-04:

case group: 1.95, 1.84, 0, 0, 0, 0, 0.01, 0.28, 0, 0.01, 0, 0, 0
control group: 63.23, 81.76, 75.39, 57.81, 44.48, 67.62, 51.98, 38.09, 80.06, 46.84, 90.77, 81.71, 64.62, 74.59

But the raw p-value for gene B, which has the following FPKM values, is 9.97E-30, which is much more significant:

case group: 19.33, 28.04, 23.6, 24.74, 23.5, 24.75, 17.92, 23.05, 16.72, 22.5, 25.94, 19.36, 20.3
control group: 38.71, 37.73, 36.04, 36.44, 53.25, 35.3, 34.58, 33.22, 46.12, 34.23, 43.95, 38.55, 35.11, 44.82

I know DESeq2 takes raw read counts as input, and I did use read counts for the differential expression analysis. However, the DESeq2-normalized counts for these two genes follow the same pattern as the FPKM values.

What I don't understand is this: gene A should have a much more significant p-value than gene B, since gene A has almost no expression in cases, but DESeq2 does not report it that way. Using shrinkage or not doesn't seem to matter; I tried both.

I would appreciate any help in explaining this.

Thank you so much!!!

deseq2 normalization • 685 views
@mikelove
Last seen 1 day ago
United States

Take a look at plotCounts() for these genes. This may help you visualize the results.


Hi Michael,

Thank you for your reply. I did take a look at plotCounts(). Here is the plot for gene A: https://ibb.co/xD0Y7Hv

Here is the plot for gene B: https://ibb.co/KqGvqC9

I still don't get it. To me, gene A should have the smaller p-value.

Thanks for your time!!!


Hard to say. In the end, I focus on FDR sets and LFC rather than p-values (see the DESeq2 paper or the apeglm paper for discussion). So I’m not very concerned with a tiny vs. a very, very tiny p-value.


Hi Michael,

Thanks again for your reply. The FDR for gene A is still bigger than that for gene B, of course. I agree with you that as long as both genes are significant after FDR correction, it's no big deal. But I am still wondering whether there is an explanation for this discrepancy. When we showed the plotCounts figures and the p-values for these two genes in the manuscript, reviewers questioned our analysis.

Thanks so much for your time.


Many factors go into the standard error (SE) of a log fold change (LFC), and that SE is what drives the Wald test. The count level in both groups and the within-group dispersion are among them. I think the gene with the smaller p-value has the lower dispersion. If you used an LRT (likelihood ratio test), the p-values might be closer to each other.
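As a rough way to look at the within-group variability mentioned above, the following minimal Python sketch computes the mean and the squared coefficient of variation (CV², sample variance divided by squared mean) for each group, using the FPKM values quoted in the question. This is only an illustration under stated assumptions: the values are FPKM rather than the raw counts DESeq2 models, CV² is a crude scale-free proxy and not DESeq2's dispersion estimate (DESeq2 fits a single dispersion per gene, and count data also carry sampling noise), and `squared_cv` is a hypothetical helper name.

```python
from statistics import mean, variance

# FPKM values as quoted in the question (NOT the raw counts DESeq2 uses).
groups = {
    "gene A, case": [1.95, 1.84, 0, 0, 0, 0, 0.01, 0.28, 0, 0.01, 0, 0, 0],
    "gene A, control": [63.23, 81.76, 75.39, 57.81, 44.48, 67.62, 51.98,
                        38.09, 80.06, 46.84, 90.77, 81.71, 64.62, 74.59],
    "gene B, case": [19.33, 28.04, 23.6, 24.74, 23.5, 24.75, 17.92,
                     23.05, 16.72, 22.5, 25.94, 19.36, 20.3],
    "gene B, control": [38.71, 37.73, 36.04, 36.44, 53.25, 35.3, 34.58,
                        33.22, 46.12, 34.23, 43.95, 38.55, 35.11, 44.82],
}


def squared_cv(values):
    """Sample variance divided by squared mean (hypothetical helper).

    A crude, scale-free proxy for within-group variability; it is not
    the negative binomial dispersion that DESeq2 estimates.
    """
    m = mean(values)
    return variance(values) / (m * m)


cv2 = {name: squared_cv(vals) for name, vals in groups.items()}

for name, value in cv2.items():
    print(f"{name}: mean = {mean(groups[name]):.2f}, CV^2 = {value:.3f}")
```

On the quoted values, the case group of gene A has by far the largest CV² of the four groups, which is at least consistent with the suggestion that gene A has the higher dispersion. The dispersion estimates DESeq2 actually used would have to be read from the fitted object.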


Hi Michael,

Thank you for your reply again.

I tried the LRT, and the p-values for the two genes did not get any closer. For gene A, the p-value is 0.0064. For gene B, the p-value is 2.55E-29. Are there other parameters I could set when I run DESeq so that I get p-values of the same order for these two genes?
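To put the Wald and LRT p-values quoted in this thread on a common scale, the sketch below converts each two-sided p-value into the equivalent absolute z-score of a standard normal. It assumes the LRT here has one degree of freedom (a two-group design compared against an intercept-only reduced model), in which case the chi-square statistic is the square of such a z-score; that assumption and the helper name `equivalent_abs_z` are mine, not stated in the thread.

```python
from statistics import NormalDist

# Two-sided p-values as quoted in this thread.
p_values = {
    ("gene A", "Wald"): 1.95e-04,
    ("gene A", "LRT"): 0.0064,
    ("gene B", "Wald"): 9.97e-30,
    ("gene B", "LRT"): 2.55e-29,
}


def equivalent_abs_z(p_two_sided):
    """Return |z| such that 2 * P(Z > |z|) equals the given p-value.

    Works from the lower tail (p / 2) so that very small p-values do
    not round to 1.0 in floating point. Hypothetical helper.
    """
    return -NormalDist().inv_cdf(p_two_sided / 2)


z_scores = {key: equivalent_abs_z(p) for key, p in p_values.items()}

for (gene, test), z in z_scores.items():
    print(f"{gene}, {test}: |z| = {z:.2f}")
```

On this scale, both tests place gene B far further out in the tail than gene A, so switching from the Wald test to the LRT does not change the ordering of the two genes.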


No, I don’t think so. It is what it is.
