Question

Limma toptable interpretation check

0

Entering edit mode

bhgyu ▴ 30

@bhgyu-13069

Last seen 6.7 years ago

Hello there,

I am a little confused by the toptable output (well really the FDR) and thought I'd quickly check my interpretation.

For the following code:

topall <- toptable(fit, coef="CaseAD", genelist=fit$genes, number=Inf, adjust="fdr", p.value=0.01, lfc=log2(1))

Would the interpretation be:

Using a threshold of 0.05 for FDR adj. p.values and a minimal fold change of 2, this analysis produces n number of genes?

And does the above mean that I would look at the adjusted p.values?

I am interested in following MAQC guidelines for this analysis, so would rather restrict my results to a FDR = 0.05 and p.values < 0.01. However, I am not entirely sure, how to get that, since I am not sure what the "adjust=fdr" relates to here?

Any advice?

Many thanks!!!

limma microarray fdr p-value results • 4.1k views

ADD COMMENT • link updated 7.8 years ago by Aaron Lun ★ 28k • written 7.8 years ago by bhgyu ▴ 30

score 1 · Answer 1 · 2017-05-30

1

Entering edit mode

Aaron Lun ★ 28k

@alun

Last seen 1 hour ago

The city by the bay

Your topTable call is asking for all genes that have FDR below 0.01 and log-fold changes greater than log2(1), i.e., zero. This effectively means that you're only selecting genes based on the FDR. Which is totally fine, because you should always be using adjusted p-values for genomic analyses.

I don't know what the MAQC guidelines say, but setting thresholds on both the FDR and p-values seems pointless to me; the former is calculated from the latter, so they'll be redundant. I would recommend just setting a FDR threshold (usually 1 or 5%) and defining all DE genes as those with FDR values below that threshold.

If you want to include a log-fold change threshold, use treat rather than setting it in topTable; see the warnings at the bottom of the documentation for ?topTable.

ADD COMMENT • link 7.8 years ago Aaron Lun ★ 28k

0

Entering edit mode

Thank you!

Just to clarify: Does that mean my lfc is redundant?

Does that also mean you should plot adjusted p.values instead of p.values?

ADD REPLY • link 7.8 years ago bhgyu ▴ 30

1

Entering edit mode

Yes, your log-fold change threshold is redundant in your original post.

For plotting purposes, it's better to use the -log10(p.value), as the enforced monotonicity of the BH correction leads to stretches of genes with the same FDR value, which is not aesthetically pleasing. (For plotting only; calling of DE genes is still done with the FDR.) If you want to visualize the threshold of significance, simply use the largest p-value corresponding to a FDR that is below your threshold. Also check out volcanoplot if that's what you intend to make.

ADD REPLY • link 7.8 years ago Aaron Lun ★ 28k

0

Entering edit mode

Thanks Aaron.

ADD REPLY • link 7.8 years ago bhgyu ▴ 30

0

Entering edit mode

Just to clarify, your lfc is redundant because you set lfc = log2(1) = 0, which is the default and imposes no fold change thresholds on your results, so you are only selecting based on adj. p-values. A fold change of 2 as you indicated would be lfc = 1, but see warning in ?topTable on setting p-val and lfc cutoffs simultaneously, see ?treat and ?topTreat for fold change thresholding.

ADD REPLY • link 7.7 years ago • updated 7.6 years ago R.S. • 0