Question: Volcanoplot with limma - RAW P-values or Adj.P-Values
12 months ago, thyagoleal wrote:

I have noticed that limma's volcanoplot() function uses uncorrected p-values from the MArrayLM object. My question is: why?

I've seen an old post where G. Smyth mentioned that FDR-corrected p-values lose some information compared with the raw ones. Could someone elucidate this, please? Another reason the author gave was that the same adj.p-value may correspond to different p-values.



12 months ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

I'm not sure what I can tell you that I didn't already say in my earlier answer to a similar question, "Volcano plot labeling troubles".

You've already repeated in your question the reason why it is preferable to use the p-value as the y-axis rather than FDR. (Actually I like the B-statistic even better, but that's another story.) The p-values are the basic values from which FDR is computed, and it is typically better to plot basic data rather than derived quantities.

Why does that not convince you?  Why would you want to force points with different p-values together on the y-axis? Or are you asking for more explanation of why different p-values can lead to the same FDR? I think that has been answered separately.
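
To make the point about distinct p-values collapsing onto one FDR value concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python (not limma code). The `bh_adjust` helper and the five p-values are made up for illustration; in R, `p.adjust(p, method = "BH")` computes the same quantity.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down; the running minimum keeps the
    # adjusted values monotone in the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for five genes (illustrative only).
raw = [0.001, 0.010, 0.030, 0.032, 0.500]
adjusted = bh_adjust(raw)

for p, q in zip(raw, adjusted):
    print(f"p = {p:.3f}   BH-adjusted = {q:.3f}")
```

In this toy set the raw p-values 0.030 and 0.032 end up with the same adjusted value (about 0.04), because the running minimum pulls the third gene down to the fourth gene's value. On an adjusted-p y-axis those two points would sit at the same height even though their raw p-values differ.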

Note that there is always a p-value cutoff that corresponds to any FDR cutoff, so you can easily indicate an FDR cutoff on the plot even if the y-axis is p-value. So using FDR as the y-axis has no advantage that I can think of.
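
One way to find that p-value cutoff, sketched in plain Python under the assumption that the FDR values are Benjamini-Hochberg adjusted p-values: take the largest raw p-value whose adjusted value is at or below the FDR threshold. The helper names and the data are hypothetical, not part of limma.

```python
import math


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def p_cutoff_for_fdr(pvalues, fdr):
    """Largest raw p-value whose BH-adjusted value is <= fdr, or None if no test passes."""
    adjusted = bh_adjust(pvalues)
    passing = [p for p, q in zip(pvalues, adjusted) if q <= fdr]
    return max(passing) if passing else None


# Made-up raw p-values for five genes (illustrative only).
raw = [0.001, 0.010, 0.030, 0.032, 0.500]
cutoff = p_cutoff_for_fdr(raw, fdr=0.05)

# Height at which to draw a horizontal line on a -log10(p) volcano plot.
y_line = -math.log10(cutoff)
print(f"raw p-value cutoff = {cutoff}, line at -log10(p) = {y_line:.2f}")
```

Every gene on or above that line has an adjusted p-value at or below the chosen FDR, so the plot can show the FDR threshold while keeping the raw p-values on the y-axis.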


Thanks for your answer. I'm not questioning your decision to implement it that way. I only asked because I wanted to know exactly why, since people often ask me about this. Anyway, thank you again.



Reply written 12 months ago by thyagoleal
12 months ago, Wolfgang Huber (EMBL European Molecular Biology Laboratory) wrote:

There's another reason to support Gordon's view. There is a fundamental difference between p-values and FDR: p-values are per-hypothesis (i.e., per-gene) properties, whereas FDR is an average across all rejected hypotheses. I.e., if you have a set of hypotheses (genes) rejected at a certain FDR $\alpha$, then the local fdr for some of these is less than $\alpha$, and for some, more than $\alpha$. The only thing you know is that the FDR overall is $\alpha$.

In general, there is no 1:1 relation between p-value and FDR. In the special case of the Benjamini-Hochberg method, such a 1:1 relation can be constructed (what's called the 'adjusted p-value'), but this assumes that the Benjamini-Hochberg method is used, with no modifications such as filtering, weighting, etc.

This assumption has seemed so natural that often it has not even been questioned (hence the popularity of the 'adjusted p-value' terminology), but in fact is not natural if there is heterogeneity between the tests, e.g., if we know that some tests have more power than others, or some have a higher prior probability of being null than others.
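
A small sketch of how weighting breaks the link between a p-value and its adjusted value. It assumes one common formulation of weighted Benjamini-Hochberg, in which each p-value is divided by a per-test weight (weights averaging 1) before the usual BH step. The weights and p-values are invented for illustration, and the helper names are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def weighted_bh_adjust(pvalues, weights):
    """BH applied to p / w, capped at 1 (assumed weighted-BH formulation)."""
    scaled = [min(1.0, p / w) for p, w in zip(pvalues, weights)]
    return bh_adjust(scaled)


# Genes 1 and 2 share the same raw p-value but get different weights,
# e.g. because one test is believed to have more power (made-up numbers).
raw = [0.01, 0.01, 0.20, 0.60]
weights = [1.6, 0.4, 1.0, 1.0]  # weights average to 1

unweighted = bh_adjust(raw)
weighted = weighted_bh_adjust(raw, weights)

for p, w, q_plain, q_weighted in zip(raw, weights, unweighted, weighted):
    print(f"p = {p:.2f}  weight = {w:.1f}  BH = {q_plain:.3f}  weighted BH = {q_weighted:.3f}")
```

With plain BH the two genes with p = 0.01 receive the same adjusted value, but once weights enter they receive different ones. The raw p-value therefore no longer determines the adjusted value on its own.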

For these reasons, the p-value and not the adjusted p-value is the preferable quantity to use in a volcano plot.
