Too small unreasonable p-values from limma
2
0
Entering edit mode
Habil Zare ▴ 200
@habil-zare-7836
Last seen 5 months ago
United States/Austin Area

I am getting unreasonably too small p-values from limma in my analysis. See the following minimal example in which a random variable has similar distributions in two conditions. So we expect to get an insignificant p-value. However, this is not the limma's output.

n1 <- 100
topTable(eBayes(lmFit(c(1:n1, 1:n1), c(rep(0,n1),rep(1,n1)))), num=Inf, coef=1)

The p-value in the output is very significant:

logFC AveExpr        t      P.Value    adj.P.Val        B
1  50.5    50.5 10.97056 3.320846e-22 3.320846e-22 12.81377

I am using limma Version 3.40.6.

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] limma_3.40.6

loaded via a namespace (and not attached):
[1] compiler_3.6.1
limma • 544 views
ADD COMMENT
1
Entering edit mode
@james-w-macdonald-5106
Last seen 9 hours ago
United States

Did you notice that your logFC and average expression are identical? That's because the coefficient you are testing is the mean of one of the groups, and you are testing to see if the average of that group is different from zero (which it is, hence the exceedingly small p-value). You should probably spend more time reading the limma User's Guide, and maybe Julian Faraway's linear models book.

ADD COMMENT
0
Entering edit mode
Habil Zare ▴ 200
@habil-zare-7836
Last seen 5 months ago
United States/Austin Area

Thanks James for the references. When I add an intercept column to the design matrix, the p-values are not significant anymore as expected:

topTable(eBayes(lmFit(c(1:n1, 1:n1), cbind(1,c(rep(0,n1),rep(1,n1))))), num=Inf, coef=2)
logFC AveExpr        t P.Value adj.P.Val    B
1 3.97e-16     5.5 2.93e-16       1         1 -4.6

I still need to figure out how the design matrix and intercept determine what is being tested. It seems that with intercept, we are comparing the averages between groups, and without intercept we are testing whether average of groups are zero.

ADD COMMENT
0
Entering edit mode

If you want to make a comment to a post, click on the big green button that says ADD COMMENT. The Add Answer button is for adding answers, which is not what you have done.

And I agree that you need to figure out design matrices, which is why I told you that you need to spend more time reading the limma User's Guide, and possibly the Faraway book.

ADD REPLY

Login before adding your answer.

Traffic: 542 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6