Dear Kasper,
I am working on Infinium methylation EPIC data to find DMP and DMR. I am following minfi user guide for the analysis.
This is the Sample_Group column in the sample sheet.
MCF10A_NT_60 |
MCF10A_BPA_60 |
MCF10A_BaP_60 |
MCF10A_BPA-BaP_60 |
MCF10AT1_NT_60 |
MCF10AT1_BPA_60 |
MCF10AT1_BaP_60 |
MCF10AT1_BPA-BaP_60 |
MCF10CA1a-cl1_NT_60 |
MCF10A_NT_60 |
MCF10A_BPA_60 |
MCF10A_BaP_60 |
MCF10A_BPA-BaP_60 |
MCF10AT1_NT_60 |
MCF10AT1_BPA_60 |
MCF10AT1_BaP_60 |
MCF10AT1_BPA-BaP_60 |
MCF10CA1a-cl1_NT_60 |
MCF10A_NT_60 |
MCF10A_BPA_60 |
MCF10A_BaP_60 |
MCF10A_BPA-BaP_60 |
MCF10AT1_NT_60 |
MCF10AT1_BPA_60 |
MCF10AT1_BaP_60 |
MCF10AT1_BPA-BaP_60 |
MCF10CA1a-cl1_NT_60 |
4A5 |
2F1 (wt) |
1G3 (DE4) |
2C2 (DE4) |
MCF7 |
4A5 |
2F1 (wt) |
1G3 (DE4) |
2C2 (DE4) |
MCF7 |
For DMP:
dmp <- dmpFinder(bVals, pheno=targets$Sample_Group, type="continuous")
head(dmp)
intercept beta t pval qval
cg22291627 0.03929582 0.8203689 247.6537 6.546336e-41 5.635203e-35
cg21359254 0.04262945 0.2735875 173.6932 2.278083e-37 9.805074e-32
cg11905617 0.31248646 -0.2829384 -150.5233 6.116164e-36 1.754968e-30
cg01269456 0.11017638 0.2731397 140.8292 2.823188e-35 6.075628e-30
cg20379012 0.88108641 -0.8007840 -126.7986 3.145894e-34 5.416085e-29
cg24417337 0.49293479 -0.4340655 -111.4600 6.076018e-33 8.717243e-28
What if I want DMP for specific comparisons? And dmpFinder returns a data.frame with columns for the statistic, the p-value, the q-value and the intercept. And, if I would like to know which CpG's are hyper- or hypo-methylated from that table how to go for that? How can I extract the tables of differentially expressed CpGs for each comparison I give?
Looking forward to your response. Thank you. @Kasper Daniel Hansen
As you said I used limma package, lmFit. Could you please check the following whether it is the rightway or not. And Could you please tell me about contMatrix once which is below.
table(targets$Sample_Group)
A_BaP A_BPA A_BPABaP A_NT
3 3 3 3
AT1_BaP AT1_BPA AT1_BPABaP AT1_NT
3 3 3 3
CA_NT fourAfive MCFseven oneGthree_DEfour
3 2 2 2
twoCtwo_DEfour twoFone_wt
2 2
cellType <- factor(targets$Sample_Group)
design <- model.matrix(~0+cellType, data=targets)
colnames(design) <- c(levels(cellType))
fit <- lmFit(mVals, design)
contMatrix <- makeContrasts(MCFseven-fourAfive, twoFone_wt-fourAfive, oneGthree_DEfour-fourAfive, twoCtwo_DEfour-fourAfive, levels=design)
fit2 <- contrasts.fit(fit, contMatrix)
fit2 <- eBayes(fit2)
summary(decideTests(fit2))
MCFseven - fourAfive twoFone_wt - fourAfive oneGthree_DEfour - fourAfive
-1 187857 75140 82980
0 332649 518059 521797
1 340312 267619 256041
twoCtwo_DEfour - fourAfive
-1 87335
0 515763
1 257720
annEPICSub <- annEPIC[match(rownames(mVals),annEPIC$Name),c(1:4,12:19,24:ncol(annEPIC))]
DMPs <- topTable(fit2, num=Inf, coef=1, genelist=annEPICSub)
Well, that's sort of a lot of differentially methylated probes, so I am not sure how useful it will be. What exactly does one do with hundreds of thousands of hits? You could whittle that list down to something manageable in one of two ways. You could use bumphunter to look for regions of consistent differential methylation (that's what I would do), or you could use treat instead of eBayes with a relatively large log fold change.
I also have no idea what mVals is. I assume from the name that it's the M-values, rather than the beta values. If not, you should make sure you are using the M-values.
Also note that 'DMPs' only contains the CpG sites for the first contrast.
Yes, mVals is M-Values. M-values have nicer statistical properties and are thus better for use in statistical analysis of methylation data whilst beta values are easy to interpret and are thus better for displaying data. How to whittle that list down to something manageable? Yes I will use bumphunter to detect DMR. Yes, DMPs contain CpG sites for the first contrast. I just gave that as an example. Thank you.
Hi James and Kasper,
Could anyone please tell me how can I apply bumphunter on the above data to find differential methylated regions. Thank you in Advance.
No, I won't do that. Even though I sometimes stray past technical help into tutorial land, that's not my intent. If you are planning to take the responsibility of analyzing your data, then take responsibility for the analysis of your data! If I just tell you what to do, then I am responsible for your analysis, not you. And I am so not down with that.
Pete Hickey and Kasper Hansen just gave a nice tutorial at BioC2016 that you could read. Or you could look at the tutorial for the bumphunter package itself, or do some Google searches.
You are mistaken I didn't want you to do that. I just want to know where I can find the tutorial for bumphunter, cz I didn't find it before. Thanks for posting the tutorial link here.