dmpFinder return value (minfi)
1
0
Entering edit mode
Biologist ▴ 90
@biologist-9801
Last seen 16 months ago

Dear Kasper,

I am working on Infinium methylation EPIC data to find DMP and DMR. I am following minfi user guide for the analysis.

This is the Sample_Group column in the sample sheet.

 MCF10A_NT_60 MCF10A_BPA_60 MCF10A_BaP_60 MCF10A_BPA-BaP_60 MCF10AT1_NT_60 MCF10AT1_BPA_60 MCF10AT1_BaP_60 MCF10AT1_BPA-BaP_60 MCF10CA1a-cl1_NT_60 MCF10A_NT_60 MCF10A_BPA_60 MCF10A_BaP_60 MCF10A_BPA-BaP_60 MCF10AT1_NT_60 MCF10AT1_BPA_60 MCF10AT1_BaP_60 MCF10AT1_BPA-BaP_60 MCF10CA1a-cl1_NT_60 MCF10A_NT_60 MCF10A_BPA_60 MCF10A_BaP_60 MCF10A_BPA-BaP_60 MCF10AT1_NT_60 MCF10AT1_BPA_60 MCF10AT1_BaP_60 MCF10AT1_BPA-BaP_60 MCF10CA1a-cl1_NT_60 4A5 2F1 (wt) 1G3 (DE4) 2C2 (DE4) MCF7 4A5 2F1 (wt) 1G3 (DE4) 2C2 (DE4) MCF7

For DMP:

dmp <- dmpFinder(bVals, pheno=targets$Sample_Group, type="continuous") head(dmp) intercept beta t pval qval cg22291627 0.03929582 0.8203689 247.6537 6.546336e-41 5.635203e-35 cg21359254 0.04262945 0.2735875 173.6932 2.278083e-37 9.805074e-32 cg11905617 0.31248646 -0.2829384 -150.5233 6.116164e-36 1.754968e-30 cg01269456 0.11017638 0.2731397 140.8292 2.823188e-35 6.075628e-30 cg20379012 0.88108641 -0.8007840 -126.7986 3.145894e-34 5.416085e-29 cg24417337 0.49293479 -0.4340655 -111.4600 6.076018e-33 8.717243e-28 What if I want DMP for specific comparisons? And dmpFinder returns a data.frame with columns for the statistic, the p-value, the q-value and the intercept. And, if I would like to know which CpG's are hyper- or hypo-methylated from that table how to go for that? How can I extract the tables of differentially expressed CpGs for each comparison I give? Looking forward to your response. Thank you. @Kasper Daniel Hansen ADD COMMENT 0 Entering edit mode @james-w-macdonald-5106 Last seen 2 days ago United States Note that dmpFinder is just a simplified wrapper around (primarily) the lmFit function from limma. If you want to do more complex things than a linear fit for continuous data or an F-test for categorical data, you will have to use lmFit directly. Also note that you have (apparently) categorical data, not continuous data, so it doesn't make any sense to say it's continuous! In this situation you will end up testing just the second coefficient of your design matrix, which will be, um, the difference between whatever R decided was the first factor, and whatever R decided was the second factor (probably a comparison of 2C2(DE4) - 1G3 (DE4)). So you are most likely totally doing the wrong thing here! Or more correctly, you are doing a thing here but it doesn't appear to be what you think you are doing. ADD COMMENT 0 Entering edit mode As you said I used limma package, lmFit. Could you please check the following whether it is the rightway or not. And Could you please tell me about contMatrix once which is below. table(targets$Sample_Group)

A_BaP            A_BPA         A_BPABaP             A_NT
3                3                3                3
AT1_BaP          AT1_BPA       AT1_BPABaP           AT1_NT
3                3                3                3
CA_NT        fourAfive         MCFseven oneGthree_DEfour
3                2                2                2
twoCtwo_DEfour       twoFone_wt
2                2

cellType <- factor(targets$Sample_Group) design <- model.matrix(~0+cellType, data=targets) colnames(design) <- c(levels(cellType)) fit <- lmFit(mVals, design) contMatrix <- makeContrasts(MCFseven-fourAfive, twoFone_wt-fourAfive, oneGthree_DEfour-fourAfive, twoCtwo_DEfour-fourAfive, levels=design) fit2 <- contrasts.fit(fit, contMatrix) fit2 <- eBayes(fit2) summary(decideTests(fit2)) MCFseven - fourAfive twoFone_wt - fourAfive oneGthree_DEfour - fourAfive -1 187857 75140 82980 0 332649 518059 521797 1 340312 267619 256041 twoCtwo_DEfour - fourAfive -1 87335 0 515763 1 257720 annEPICSub <- annEPIC[match(rownames(mVals),annEPIC$Name),c(1:4,12:19,24:ncol(annEPIC))]

DMPs <- topTable(fit2, num=Inf, coef=1, genelist=annEPICSub)

0
Entering edit mode

Well, that's sort of a lot of differentially methylated probes, so I am not sure how useful it will be. What exactly does one do with hundreds of thousands of hits? You could whittle that list down to something manageable in one of two ways. You could use bumphunter to look for regions of consistent differential methylation (that's what I would do), or you could use treat instead of eBayes with a relatively large log fold change.

I also have no idea what mVals is. I assume from the name that it's the M-values, rather than the beta values. If not, you should make sure you are using the M-values.

Also note that 'DMPs' only contains the CpG sites for the first contrast.

0
Entering edit mode

Yes, mVals is M-Values. M-values have nicer statistical properties and are thus better for use in statistical analysis of methylation data whilst beta values are easy to interpret and are thus better for displaying data. How to whittle that list down to something manageable? Yes I will use bumphunter to detect DMR. Yes, DMPs contain CpG sites for the first contrast. I just gave that as an example. Thank you.

0
Entering edit mode

Hi James and Kasper,

Could anyone please tell me how can I apply bumphunter on the above data to find differential methylated regions. Thank you in Advance.

1
Entering edit mode

No, I won't do that. Even though I sometimes stray past technical help into tutorial land, that's not my intent. If you are planning to take the responsibility of analyzing your data, then take responsibility for the analysis of your data! If I just tell you what to do, then I am responsible for your analysis, not you. And I am so not down with that.

Pete Hickey and Kasper Hansen just gave a nice tutorial at BioC2016 that you could read. Or you could look at the tutorial for the bumphunter package itself, or do some Google searches.

0
Entering edit mode

You are mistaken I didn't want you to do that. I just want to know where I can find the tutorial for bumphunter, cz I didn't find it before. Thanks for posting the tutorial link here.