Question: Limma toptable interpretation check
gravatar for bhgyu
10 months ago by
bhgyu30 wrote:

Hello there, 

I am a little confused by the toptable output (well really the FDR) and thought I'd quickly check my interpretation. 

For the following code:

topall <- toptable(fit, coef="CaseAD", genelist=fit$genes, number=Inf, adjust="fdr", p.value=0.01, lfc=log2(1))

Would the interpretation be:

Using a threshold of 0.05 for FDR adj. p.values and a minimal fold change of 2, this analysis produces n number of genes?

And does the above mean that I would look at the adjusted p.values? 

I am interested in following MAQC guidelines for this analysis, so would rather restrict my results to a FDR = 0.05 and p.values < 0.01. However, I am not entirely sure, how to get that, since I am not sure what the "adjust=fdr" relates to here? 

Any advice?

Many thanks!!!


ADD COMMENTlink modified 10 months ago by Aaron Lun19k • written 10 months ago by bhgyu30
gravatar for Aaron Lun
10 months ago by
Aaron Lun19k
Cambridge, United Kingdom
Aaron Lun19k wrote:

Your topTable call is asking for all genes that have FDR below 0.01 and log-fold changes greater than log2(1), i.e., zero. This effectively means that you're only selecting genes based on the FDR. Which is totally fine, because you should always be using adjusted p-values for genomic analyses.

I don't know what the MAQC guidelines say, but setting thresholds on both the FDR and p-values seems pointless to me; the former is calculated from the latter, so they'll be redundant. I would recommend just setting a FDR threshold (usually 1 or 5%) and defining all DE genes as those with FDR values below that threshold.

If you want to include a log-fold change threshold, use treat rather than setting it in topTable; see the warnings at the bottom of the documentation for ?topTable.

ADD COMMENTlink modified 10 months ago • written 10 months ago by Aaron Lun19k

Thank you!

Just to clarify: Does that mean my lfc is redundant? 

Does that also mean you should plot adjusted p.values instead of p.values? 


ADD REPLYlink written 10 months ago by bhgyu30

Yes, your log-fold change threshold is redundant in your original post.

For plotting purposes, it's better to use the -log10(p.value), as the enforced monotonicity of the BH correction leads to stretches of genes with the same FDR value, which is not aesthetically pleasing. (For plotting only; calling of DE genes is still done with the FDR.) If you want to visualize the threshold of significance, simply use the largest p-value corresponding to a FDR that is below your threshold. Also check out volcanoplot if that's what you intend to make.

ADD REPLYlink modified 10 months ago • written 10 months ago by Aaron Lun19k

Thanks Aaron. 

ADD REPLYlink written 10 months ago by bhgyu30

Just to clarify, your lfc is redundant because you set lfc = log2(1) = 0, which is the default and imposes no fold change thresholds on your results, so you are only selecting based on adj. p-values. A fold change of 2 as you indicated would be lfc = 1, but see warning in ?topTable on setting p-val and lfc cutoffs simultaneously, see ?treat and ?topTreat for fold change thresholding.

ADD REPLYlink modified 8 months ago • written 9 months ago by R.S.0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 317 users visited in the last hour