Question: DESeq2 produces adjusted p-values = 1
2.4 years ago by
Assa Yeroslaviz1.3k
Munich, Germany
Assa Yeroslaviz1.3k wrote:



I'm working on a mouse data set in which the mice were exposed to different treatments (HP, DR). The experiment has 9 conditions with 3 (or 4) samples per condition. I have one "zero" control and two controls after pre-conditioning of the mice. I also have two time points at which RNASeq data was prepared (4h and 24h). We did both RNASeq and miRNA-Seq.

We are running multiple comparisons and all looks well, except for one comparison in the miRNASeq data set. In this comparison I get an adjusted p-value of 1 for all the genes. The comparison is between the zero control and one of the controls after pre-conditioning. In the RNASeq data we get good results, with quite a few significant genes under the adjusted p-value threshold.

I know the two data sets can't be compared directly, but I mention the RNASeq result to show that the same workflow gives sensible output there.

I would like to know what could explain an adjusted p-value of 1 for all genes when the p-values and log2FC look quite normal.

My DESeq2 version is 1.8.1.





ADD COMMENTlink modified 2.4 years ago by Steve Lianoglou12k • written 2.4 years ago by Assa Yeroslaviz1.3k

Hi Assa. Are you trying to compare samples without replicates? This happens in that case. P-values without replicates don't make much sense anyway.

ADD REPLYlink written 2.4 years ago by Vivek.b40

No, I have either 3 or 4 replicates for each condition.

ADD REPLYlink written 2.4 years ago by Assa Yeroslaviz1.3k
2.4 years ago by
Steve Lianoglou12k wrote:

It's a property of the Benjamini & Hochberg correction, and doesn't have anything to do with DESeq2 per se, as this is just working on the pvalues it generates. When you don't have some type of enrichment of small/significant pvalues -- which is to say when it looks like there really isn't much of anything that's significant -- the adjusted pvalues get hammered hard (and also exhibit this discretized behavior).
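To make the mechanics concrete, here is a minimal sketch of the BH adjustment written out by hand. The function name `bh_by_hand` is made up for illustration; `p.adjust` is the real base R function and is what you should use in practice. Each sorted p-value is scaled by n/rank, and a running minimum taken from the largest p-value downwards keeps the adjusted values monotone. When no p-value is small, that running minimum is what ties many adjusted values to the same number.

```r
# Hand-rolled Benjamini-Hochberg adjustment (illustrative only)
bh_by_hand <- function(p) {
  n <- length(p)
  ord <- order(p, decreasing = TRUE)   # largest p-value first
  scaled <- p[ord] * n / seq(n, 1)     # p * n / rank (rank n for the largest)
  adj <- pmin(1, cummin(scaled))       # running minimum, capped at 1
  adj[order(ord)]                      # restore the original order
}

set.seed(1)                            # arbitrary seed, only for reproducibility
pval <- runif(500)                     # simulated null p-values, not real data

# Agrees with base R's implementation
stopifnot(isTRUE(all.equal(bh_by_hand(pval), p.adjust(pval, "BH"))))

# Number of distinct adjusted values; under the null this is usually well below 500
length(unique(bh_by_hand(pval)))
```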

You said in your post that "... the p-value and log2FC looks quite normal," but what does that mean? Take a look at a histogram of the pvalues you have in this comparison, how does it look? I'm guessing there's no small "bump" towards the low pvalue side? Maybe even a relative paucity of pvalues on the left/significant side?
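One way to run that check, as a sketch: the snippet below uses simulated p-values rather than real data, so `p_null` and `p_signal` are stand-ins, and the beta distribution used to fake "true positives" is an arbitrary choice. With an actual DESeq2 results object (assumed here to be called `res`) you would pass `res$pvalue` to `hist()` instead.

```r
set.seed(42)                                     # arbitrary seed
p_null   <- runif(500)                           # nothing differentially expressed
p_signal <- c(rbeta(50, 0.5, 20), runif(450))    # 50 simulated hits with small p-values

breaks <- seq(0, 1, length.out = 21)             # 20 bins of width 0.05
h_null   <- hist(p_null,   breaks = breaks, plot = FALSE)
h_signal <- hist(p_signal, breaks = breaks, plot = FALSE)

# Counts in the leftmost bin (p <= 0.05): a "bump" here is what enrichment looks like
c(null = h_null$counts[1], signal = h_signal$counts[1])

# Smallest adjusted p-value in each scenario
c(null   = min(p.adjust(p_null, "BH")),
  signal = min(p.adjust(p_signal, "BH")))

# With real data (assumption: `res` is a DESeq2 results object):
# hist(res$pvalue, breaks = 20)
```

If the leftmost bin of your real histogram is no taller than the others, or is even lower, BH has nothing to work with and the adjusted p-values will pile up near 1.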

We can also show that you get something that approximates this behavior when we generate a vector of pvalues you would get under the null (i.e. no significant results). In this case, the p-values would be uniformly distributed on [0,1]. Let's say there are 500 miRNAs you are testing here and none of them are really differentially expressed:

## No seed is set, so the exact numbers will differ from run to run
pval <- runif(500)
padj <- p.adjust(pval, 'BH')
summary(padj)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.4832  0.9131  0.9131  0.9408  0.9855  0.9961

## Generate a slightly hosed pval distribution where we draw pvalues
## from [0.05, 1], and you see this discretization of adjusted pvalues even more
pval2 <- runif(500, min = 0.05, max = 1)
padj2 <- p.adjust(pval2, 'BH')
summary(padj2)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.9988  0.9988  0.9988  0.9989  0.9988  0.9999

Hope that helps

ADD COMMENTlink written 2.4 years ago by Steve Lianoglou12k

Yes, Steve is correct.

Here is a comment I posted a while ago which uses a plot to show how you get repeated values:

C: adj.P metod = "BH" many exact the same value

In your case those values are 1, meaning no genes survived multiple test correction.

ADD REPLYlink written 2.4 years ago by Michael Love13k