Question: Limma toptable interpretation check
gravatar for bhgyu
18 months ago by
bhgyu30 wrote:

Hello there, 

I am a little confused by the toptable output (well really the FDR) and thought I'd quickly check my interpretation. 

For the following code:

topall <- toptable(fit, coef="CaseAD", genelist=fit$genes, number=Inf, adjust="fdr", p.value=0.01, lfc=log2(1))

Would the interpretation be:

Using a threshold of 0.05 for FDR adj. p.values and a minimal fold change of 2, this analysis produces n number of genes?

And does the above mean that I would look at the adjusted p.values? 

I am interested in following MAQC guidelines for this analysis, so would rather restrict my results to a FDR = 0.05 and p.values < 0.01. However, I am not entirely sure, how to get that, since I am not sure what the "adjust=fdr" relates to here? 

Any advice?

Many thanks!!!


ADD COMMENTlink modified 18 months ago by Aaron Lun21k • written 18 months ago by bhgyu30
gravatar for Aaron Lun
18 months ago by
Aaron Lun21k
Cambridge, United Kingdom
Aaron Lun21k wrote:

Your topTable call is asking for all genes that have FDR below 0.01 and log-fold changes greater than log2(1), i.e., zero. This effectively means that you're only selecting genes based on the FDR. Which is totally fine, because you should always be using adjusted p-values for genomic analyses.

I don't know what the MAQC guidelines say, but setting thresholds on both the FDR and p-values seems pointless to me; the former is calculated from the latter, so they'll be redundant. I would recommend just setting a FDR threshold (usually 1 or 5%) and defining all DE genes as those with FDR values below that threshold.

If you want to include a log-fold change threshold, use treat rather than setting it in topTable; see the warnings at the bottom of the documentation for ?topTable.

ADD COMMENTlink modified 18 months ago • written 18 months ago by Aaron Lun21k

Thank you!

Just to clarify: Does that mean my lfc is redundant? 

Does that also mean you should plot adjusted p.values instead of p.values? 


ADD REPLYlink written 18 months ago by bhgyu30

Yes, your log-fold change threshold is redundant in your original post.

For plotting purposes, it's better to use the -log10(p.value), as the enforced monotonicity of the BH correction leads to stretches of genes with the same FDR value, which is not aesthetically pleasing. (For plotting only; calling of DE genes is still done with the FDR.) If you want to visualize the threshold of significance, simply use the largest p-value corresponding to a FDR that is below your threshold. Also check out volcanoplot if that's what you intend to make.

ADD REPLYlink modified 18 months ago • written 18 months ago by Aaron Lun21k

Thanks Aaron. 

ADD REPLYlink written 18 months ago by bhgyu30

Just to clarify, your lfc is redundant because you set lfc = log2(1) = 0, which is the default and imposes no fold change thresholds on your results, so you are only selecting based on adj. p-values. A fold change of 2 as you indicated would be lfc = 1, but see warning in ?topTable on setting p-val and lfc cutoffs simultaneously, see ?treat and ?topTreat for fold change thresholding.

ADD REPLYlink modified 16 months ago • written 17 months ago by R.S.0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 362 users visited in the last hour