Question: Why are the p-values and logFC calculated by limma so small?
5 months ago by xingxd160
xingxd160 wrote:

Hi all

• I use limma for differential expression analysis and want to draw a volcano plot with logFC on the x-axis and -log10(adjusted p-value) on the y-axis. However, the adjusted p-values output by limma are so small that -log10 of them goes up to 300, which means the p-values given by limma are on the order of 1E-300. In other papers, the adjusted p-values are on the order of 1E-30, so -log10 is about 30. Why are the p-values from limma so small? How can I bring the p-values to the 1E-30 scale?
• The logFC values are small too. I tried calculating the mean expression of each gene in the disease samples and in the normal samples and then computing the logFC from those means, and the result is larger than what limma outputs. If I use a log2(2) threshold, almost no genes are significant; I have to use log2(1.5) or even log2(1.2). Why are so many genes highly significant (with very low p-values) but with a very small logFC at the same time? Can I still consider these genes differentially expressed (based on the adjusted p-value) even though their logFC is small?

Best

Tags: limma, volcanoplot, logfc, pvalue
Answer: Why are the p-values and logFC calculated by limma so small?
5 months ago by Gordon Smyth
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth wrote:

You haven't told us anything about your data or analysis, so it's impossible to say why you get the results that you do.

If your data has a huge number of replicates, then it would be natural to get small p-values even when the fold changes are relatively modest.
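To illustrate the point about replicates, here is a minimal Python sketch. It is not limma's moderated t-test: it uses a plain two-sample statistic with a normal approximation, and the fold change, the standard deviation and the group sizes are all hypothetical values chosen for illustration. The function name `two_sample_p_value` is likewise made up for this example.

```python
import math

def two_sample_p_value(log_fc, sd, n1, n2):
    """Two-sided p-value for a difference in group means.

    Uses a normal approximation, which is reasonable only when n1 and n2
    are large. This is NOT limma's moderated t-test, just an illustration
    of how the group sizes drive the p-value.
    """
    standard_error = sd * math.sqrt(1.0 / n1 + 1.0 / n2)
    z = abs(log_fc) / standard_error
    # erfc(z / sqrt(2)) is the two-sided tail probability of a standard normal.
    return math.erfc(z / math.sqrt(2.0))

log_fc = math.log2(1.2)  # a modest fold change (hypothetical)
sd = 1.0                 # within-group standard deviation (hypothetical)

# The same fold change and spread, with more and more replicates per group.
for n1, n2 in [(30, 30), (300, 300), (2000, 3500)]:
    p = two_sample_p_value(log_fc, sd, n1, n2)
    print(f"n1={n1:5d}  n2={n2:5d}  p={p:.3g}")
```

The fold change never changes in this loop; only the standard error shrinks as the groups grow, and that alone pushes the p-value towards zero.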

The limma volcanoplot function uses -log10(p-value) for the y-axis, not -log10(adjusted p-value).
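The difference between the two y-axis choices can be sketched in Python as follows. The adjustment shown is a hand-written Benjamini-Hochberg procedure, assumed here as the adjustment method because the thread does not say which one was used, and the p-values are made up for illustration.

```python
import math

def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [1e-40, 1e-12, 0.003, 0.04, 0.5]  # hypothetical raw p-values
adj = bh_adjust(raw)

for p, q in zip(raw, adj):
    print(f"-log10(raw) = {-math.log10(p):6.2f}   -log10(adjusted) = {-math.log10(q):6.2f}")
```

Adjusted p-values are never smaller than the raw ones, so a plot of -log10(adjusted p-value) sits at or below a plot of -log10(p-value) for every gene.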

Yes, you are right! I am using limma for a single-cell analysis. There are about 2000+ cells in one condition and about 3500+ cells in the other. I simply used limma to compare the two conditions, following the "lmFit" and "eBayes" steps.

Yes, that is a huge number of replicates from limma's point of view. It is to be expected that some of the p-values will be very small.

If you use limma for scRNA-seq, it is very important that you only keep genes in the analysis that are detected in a reasonable number of cells. In your case, that might be a few hundred cells.
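A minimal Python sketch of this kind of filter is shown below. The helper `keep_detected_genes`, the toy counts and the threshold are all hypothetical, and "detected" is assumed to mean a count above zero; with thousands of cells the threshold would be much larger, for example the few hundred cells suggested above.

```python
def keep_detected_genes(counts, min_cells):
    """Return the names of genes detected (count > 0) in at least min_cells cells.

    counts maps a gene name to a list of per-cell counts.
    """
    kept = []
    for gene, per_cell in counts.items():
        n_detected = sum(1 for c in per_cell if c > 0)
        if n_detected >= min_cells:
            kept.append(gene)
    return kept

# Toy data: three hypothetical genes measured in ten cells.
counts = {
    "geneA": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    "geneB": [2, 0, 1, 0, 3, 0, 0, 4, 1, 0],
    "geneC": [5, 3, 8, 2, 6, 4, 7, 1, 9, 3],
}

print(keep_detected_genes(counts, min_cells=5))
```

Only the genes that pass the filter would then go on to the model fit.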

Good advice! I should first filter out genes that are expressed in only a small percentage of the cells. So, do you mean that very small p-values and logFC are reasonable with a huge number of replicates? And can I lower the logFC threshold for identifying significantly expressed genes, for example from log2(2) to log2(1.2)?