Question: DESeq2 produces adjusted p-values = 1
Assa Yeroslaviz (Munich, Germany) wrote, 4.4 years ago:

Hi,

I'm working on a mouse data set, where the mice were exposed to different treatments (HP, DR). In the experiment we have 9 conditions with 3 (or 4) samples per condition. I have one "zero" control and two controls after pre-conditioning of the mice. I also have two time points where RNASeq data was prepared (4h and 24h). We did both RNASeq and miRNA-Seq.

We are running multiple comparisons and all looks well, except for one comparison in the miRNASeq data set. In this comparison I get an adjusted p-value of 1 for all the genes. The comparison is between the zero control and one of the controls after pre-conditioning. In the RNASeq data we get good results for the same comparison, with quite a few significant genes under the adjusted p-value threshold.

I know the two data sets can't be compared directly, but I want to point out that the same workflow gives sensible results on the RNASeq data.

I would like to know what could explain an adjusted p-value of 1 for all genes when the p-values and log2FC look quite normal.

My DESeq2 version is 1.8.1.

thanks,

Assa


Hi Assa. Are you trying to compare samples without replicates? That would produce this result. P-values without replicates don't make much sense anyway.

No, I have either 3 or 4 replicates for each condition.

Steve Lianoglou (Denali) wrote, 4.4 years ago:

It's a property of the Benjamini & Hochberg correction, and doesn't have anything to do with DESeq2 per se, since the correction just works on the pvalues DESeq2 generates. When you don't have some type of enrichment of small/significant pvalues (which is to say, when it looks like there really isn't much of anything that's significant), the adjusted pvalues get hammered hard and also exhibit this discretized behavior, where many of them share exactly the same value.
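To make the mechanism concrete, here is a minimal sketch of the Benjamini & Hochberg adjustment written out by hand and checked against `p.adjust`. The helper name `bh_adjust` and the five example pvalues are made up for illustration. Each sorted pvalue is multiplied by n/rank, and a running minimum taken from the largest pvalue downwards keeps the adjusted values monotone. That running minimum is what produces repeated values.

```r
# Benjamini & Hochberg adjustment by hand (illustrative helper, not part of DESeq2)
bh_adjust <- function(p) {
  n <- length(p)
  ord <- order(p, decreasing = TRUE)   # largest pvalue first
  scaled <- p[ord] * n / (n:1)         # p_(i) * n / i for ranks i = n, ..., 1
  adj <- pmin(1, cummin(scaled))       # running minimum, capped at 1
  adj[order(ord)]                      # restore the input order
}

p <- c(0.01, 0.20, 0.25, 0.60, 0.90)   # made-up pvalues
bh_adjust(p)
all.equal(bh_adjust(p), p.adjust(p, method = "BH"))
```

In this toy vector the second and third pvalues (0.20 and 0.25) end up with the same adjusted value, because the scaled value for 0.25 is smaller than the one for 0.20 and the running minimum carries it down.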

You said in your post that "... the p-value and log2FC looks quite normal," but what does that mean? Take a look at a histogram of the pvalues you have in this comparison, how does it look? I'm guessing there's no small "bump" towards the low pvalue side? Maybe even a relative paucity of pvalues on the left/significant side?
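A rough sketch of that check, using simulated pvalues as a stand-in for the real ones. With DESeq2 you would look at the `pvalue` column of the table returned by `results()`. The seed, the variable names and the beta distribution used to fake "real signal" are all assumptions made for illustration.

```r
# Simulated stand-ins for the pvalue column of a results table
set.seed(1)
p_null   <- runif(500)                         # nothing differentially expressed
p_signal <- c(runif(450), rbeta(50, 0.5, 20))  # 50 features with mostly small pvalues

breaks <- seq(0, 1, by = 0.05)
h_null   <- hist(p_null,   breaks = breaks, plot = FALSE)
h_signal <- hist(p_signal, breaks = breaks, plot = FALSE)

# Counts in the leftmost bin (pvalues up to 0.05): about 5% of the features
# under the null, clearly more when there is real signal
h_null$counts[1]
h_signal$counts[1]

# Drop plot = FALSE to draw the histograms themselves
```

A flat histogram, or one that is depleted on the left, is the situation where the adjusted pvalues all end up at or near 1.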

We can also show that you get something that approximates this behavior when we generate a vector of pvalues you would get under the null (ie. no significant results) -- in this case, the p-values would be uniformly distributed on [0,1]. Let's say there are 500 miRNAs you are testing here and none of them are really differentially expressed:

set.seed(2/3)
pval <- runif(500)
summary(p.adjust(pval, method = "BH"))

Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.4832  0.9131  0.9131  0.9408  0.9855  0.9961

## Generate a slightly hosed pval distribution where we draw pvalues
## from [0.05, 1], and you see this discretization of adjusted pvalues even more
pval2 <- runif(500, 0.05)
summary(p.adjust(pval2, method = "BH"))

Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.9988  0.9988  0.9988  0.9989  0.9988  0.9999
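One way to see the discretization directly is to count distinct values before and after the adjustment. This is only a sketch: the seed and the variable names `p_raw` and `p_adj` are arbitrary choices, so the exact numbers will differ from the summaries above.

```r
# Count distinct values before and after Benjamini & Hochberg adjustment
set.seed(1)
p_raw <- runif(500, min = 0.05, max = 1)   # no pvalues below 0.05
p_adj <- p.adjust(p_raw, method = "BH")

length(unique(p_raw))   # essentially every raw pvalue is distinct
length(unique(p_adj))   # only a handful of distinct adjusted values remain
min(p_adj)              # even the smallest adjusted pvalue is close to 1
```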

Hope that helps


Yes, Steve is correct.

Here is a comment I posted a while ago which uses a plot to show how you get repeated values:

C: adj.P metod = "BH" many exact the same value

In your case those values are 1, meaning no genes survived multiple testing correction.