Question: Limma toptable interpretation check
gravatar for bhgyu
13 months ago by
bhgyu30 wrote:

Hello there, 

I am a little confused by the toptable output (well really the FDR) and thought I'd quickly check my interpretation. 

For the following code:

topall <- toptable(fit, coef="CaseAD", genelist=fit$genes, number=Inf, adjust="fdr", p.value=0.01, lfc=log2(1))

Would the interpretation be:

Using a threshold of 0.05 for FDR adj. p.values and a minimal fold change of 2, this analysis produces n number of genes?

And does the above mean that I would look at the adjusted p.values? 

I am interested in following MAQC guidelines for this analysis, so would rather restrict my results to a FDR = 0.05 and p.values < 0.01. However, I am not entirely sure, how to get that, since I am not sure what the "adjust=fdr" relates to here? 

Any advice?

Many thanks!!!


ADD COMMENTlink modified 13 months ago by Aaron Lun20k • written 13 months ago by bhgyu30
gravatar for Aaron Lun
13 months ago by
Aaron Lun20k
Cambridge, United Kingdom
Aaron Lun20k wrote:

Your topTable call is asking for all genes that have FDR below 0.01 and log-fold changes greater than log2(1), i.e., zero. This effectively means that you're only selecting genes based on the FDR. Which is totally fine, because you should always be using adjusted p-values for genomic analyses.

I don't know what the MAQC guidelines say, but setting thresholds on both the FDR and p-values seems pointless to me; the former is calculated from the latter, so they'll be redundant. I would recommend just setting a FDR threshold (usually 1 or 5%) and defining all DE genes as those with FDR values below that threshold.

If you want to include a log-fold change threshold, use treat rather than setting it in topTable; see the warnings at the bottom of the documentation for ?topTable.

ADD COMMENTlink modified 13 months ago • written 13 months ago by Aaron Lun20k

Thank you!

Just to clarify: Does that mean my lfc is redundant? 

Does that also mean you should plot adjusted p.values instead of p.values? 


ADD REPLYlink written 13 months ago by bhgyu30

Yes, your log-fold change threshold is redundant in your original post.

For plotting purposes, it's better to use the -log10(p.value), as the enforced monotonicity of the BH correction leads to stretches of genes with the same FDR value, which is not aesthetically pleasing. (For plotting only; calling of DE genes is still done with the FDR.) If you want to visualize the threshold of significance, simply use the largest p-value corresponding to a FDR that is below your threshold. Also check out volcanoplot if that's what you intend to make.

ADD REPLYlink modified 13 months ago • written 13 months ago by Aaron Lun20k

Thanks Aaron. 

ADD REPLYlink written 13 months ago by bhgyu30

Just to clarify, your lfc is redundant because you set lfc = log2(1) = 0, which is the default and imposes no fold change thresholds on your results, so you are only selecting based on adj. p-values. A fold change of 2 as you indicated would be lfc = 1, but see warning in ?topTable on setting p-val and lfc cutoffs simultaneously, see ?treat and ?topTreat for fold change thresholding.

ADD REPLYlink modified 11 months ago • written 12 months ago by R.S.0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 157 users visited in the last hour