Discrepanices between minifi dmp finder and lmfit
Entering edit mode
Last seen 4.6 years ago

Dear all,

currently Im working on methylation analysis of Illumina 450k beadchip data using the minfi workflow for identifying differentially methylated CpG sites. The design consist of 55 samples and we have different groups of covariates and one group for example has 3 levels. When applying dmp finder we obtain different significantly methylated CpG sites compared to limfit by limma. I read that the minfi dmpfinder is a wrapper around lmfit. But why are the results so different then? Appreciate any help and hints for a solution.

Phenotype data

ID Ethnicity Age GroupSurgery
1 CEU 13 1
2 CEU 25 1
3 ASN 45 2
4 CEU 55 3
5 CEU 67 3


Command dmp finder:

dmps.cat <- dmpFinder(m, pheno=phenotype_data$GroupSurgery, type="categorical")


Command limma

group1 <- factor(phenotype_data$GroupSurgery, levels=c("1","2","3"))

design <- model.matrix(~group1)
fit <- lmFit(m,design)
fit <- eBayes(fit)
topTable(fit, coef=group1)


Results dmp finder

cpg intercept f pval qval
cg14829516 0.320442 28.70813 3.980690e-09 0.001369007
cg10099799 -2.407821 22.26578 1.034389e-07 0.016183809
cg20036732 -1.819726 21.48785 1.578203e-07 0.016183809
cg14587604 -3.087692 21.16708 1.882320e-07 0.016183809
cg07435215 1.033033 19.85411 3.921676e-07 0.026974223
cg01638741 2.113079 19.25624 5.516534e-07 0.027544259


Results lmfit

CpG X.Intercept group12 group13 AveExpr F pval adjPv
cg18560014 3.610993 -0.019218780 -0.01794563 3.600508 25462.76 6.652259e-89 1.515816e-83
cg06862343 3.526721 -0.006593399 0.04692367 3.537600 24921.88 1.224412e-88 1.515816e-83
cg16427743 3.134849 0.036864198 0.02867473 3.153394 24867.83 1.302324e-88 1.515816e-83
cg19973604 3.080761 0.063361587 0.04370996 3.111115 24611.21 1.748740e-88 1.515816e-83
cg07704672 3.457744 -0.012866272 -0.02403017 3.447448 24604.27 1.762806e-88 1.515816e-83
cg26673073 3.554504 -0.024529503 0.01566954 3.551641 24258.88 2.634272e-88 1.887649e-83


illimina 450k methylation minfi limma • 1.1k views
Entering edit mode
Last seen 1 hour ago
United States

Two reasons. First, dmpFinder is smart enough to do an F-test that doesn't include the intercept (which estimates the mean of surgery group 1), whereas you did an F-test that does include this intercept. That doesn't make any sense; the F-test is testing the null hypothesis that all coefficients are equal to zero, and there is no expectation that the mean M-value for the surgery group 1 should be zero. The correct test would be

topTable(fit, 2:3)

Second, dmpFinder by default doesn't do the empirical Bayes step, whereas you are doing that when using limma directly. With 50 samples or so, I would be surprised to see any materially different results between the two, but there may be some small differences. But the main problem is that you are doing a nonsensical F-test when you use limma directly.

Entering edit mode

I think I got it now and understand the main differences between the dmpfinder and limma. But still wondering if I can perform pairwise comparisons such as 1-2, 2-3, 3-1? Do I need to create a contrast matrix
or does setting the coefficients in toptable works as well? Further I want to add age and ethnicitiy into the analysis. How can I set this to get the results for all the coefficients together? Thank you very much!

Entering edit mode

You will get further, faster by simply reading the limma User's Guide, which covers all these questions and more. If you are planning to use R and Bioconductor to analyze data, you will have to become proficient at consulting the vignettes and help pages that come with the packages you are intending to use. While this support site is an invaluable resource, it is not intended as a substitute for information that is already readily available elsewhere.


Login before adding your answer.

Traffic: 634 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6