Question: dmpFinder return value (minfi)
0
gravatar for Biologist
3.1 years ago by
Biologist70
Biologist70 wrote:

Dear Kasper,

I am working on Infinium methylation EPIC data to find DMP and DMR. I am following minfi user guide for the analysis.

 This is the Sample_Group column in the sample sheet.

MCF10A_NT_60
MCF10A_BPA_60
MCF10A_BaP_60
MCF10A_BPA-BaP_60
MCF10AT1_NT_60
MCF10AT1_BPA_60
MCF10AT1_BaP_60
MCF10AT1_BPA-BaP_60
MCF10CA1a-cl1_NT_60
MCF10A_NT_60
MCF10A_BPA_60
MCF10A_BaP_60
MCF10A_BPA-BaP_60
MCF10AT1_NT_60
MCF10AT1_BPA_60
MCF10AT1_BaP_60
MCF10AT1_BPA-BaP_60
MCF10CA1a-cl1_NT_60
MCF10A_NT_60
MCF10A_BPA_60
MCF10A_BaP_60
MCF10A_BPA-BaP_60
MCF10AT1_NT_60
MCF10AT1_BPA_60
MCF10AT1_BaP_60
MCF10AT1_BPA-BaP_60
MCF10CA1a-cl1_NT_60
4A5
2F1 (wt)
1G3 (DE4)
2C2 (DE4)
MCF7
4A5
2F1 (wt)
1G3 (DE4)
2C2 (DE4)
MCF7

For DMP:

dmp <- dmpFinder(bVals, pheno=targets$Sample_Group, type="continuous")
head(dmp)
            intercept       beta         t         pval         qval
cg22291627 0.03929582  0.8203689  247.6537 6.546336e-41 5.635203e-35
cg21359254 0.04262945  0.2735875  173.6932 2.278083e-37 9.805074e-32
cg11905617 0.31248646 -0.2829384 -150.5233 6.116164e-36 1.754968e-30
cg01269456 0.11017638  0.2731397  140.8292 2.823188e-35 6.075628e-30
cg20379012 0.88108641 -0.8007840 -126.7986 3.145894e-34 5.416085e-29
cg24417337 0.49293479 -0.4340655 -111.4600 6.076018e-33 8.717243e-28

What if I want DMP for specific comparisons? And dmpFinder returns a data.frame with columns for the statistic, the p-value, the q-value and the intercept. And, if I would like to know which CpG's are hyper- or hypo-methylated from that table how to go for that? How can I extract the tables of differentially expressed CpGs for each comparison I give?

Looking forward to your response. Thank you. @Kasper Daniel Hansen

 

ADD COMMENTlink modified 3.1 years ago by James W. MacDonald50k • written 3.1 years ago by Biologist70
Answer: dmpFinder return value (minfi)
0
gravatar for James W. MacDonald
3.1 years ago by
United States
James W. MacDonald50k wrote:

Note that dmpFinder is just a simplified wrapper around (primarily) the lmFit function from limma. If you want to do more complex things than a linear fit for continuous data or an F-test for categorical data, you will have to use lmFit directly.

Also note that you have (apparently) categorical data, not continuous data, so it doesn't make any sense to say it's continuous! In this situation you will end up testing just the second coefficient of your design matrix, which will be, um, the difference between whatever R decided was the first factor, and whatever R decided was the second factor (probably a comparison of 2C2(DE4) - 1G3 (DE4)). So you are most likely totally doing the wrong thing here! Or more correctly, you are doing a thing here but it doesn't appear to be what you think you are doing.

ADD COMMENTlink written 3.1 years ago by James W. MacDonald50k

As you said I used limma package, lmFit. Could you please check the following whether it is the rightway or not. And Could you please tell me about contMatrix once which is below.

table(targets$Sample_Group)

           A_BaP            A_BPA         A_BPABaP             A_NT
               3                3                3                3
         AT1_BaP          AT1_BPA       AT1_BPABaP           AT1_NT
               3                3                3                3
           CA_NT        fourAfive         MCFseven oneGthree_DEfour
               3                2                2                2
  twoCtwo_DEfour       twoFone_wt
               2                2

cellType <- factor(targets$Sample_Group)

design <- model.matrix(~0+cellType, data=targets)
colnames(design) <- c(levels(cellType))
fit <- lmFit(mVals, design)

contMatrix <- makeContrasts(MCFseven-fourAfive, twoFone_wt-fourAfive, oneGthree_DEfour-fourAfive, twoCtwo_DEfour-fourAfive, levels=design)

fit2 <- contrasts.fit(fit, contMatrix)
fit2 <- eBayes(fit2)
summary(decideTests(fit2))
   MCFseven - fourAfive twoFone_wt - fourAfive oneGthree_DEfour - fourAfive
-1               187857                  75140                        82980
0                332649                 518059                       521797
1                340312                 267619                       256041
   twoCtwo_DEfour - fourAfive
-1                      87335
0                      515763
1                      257720

annEPICSub <- annEPIC[match(rownames(mVals),annEPIC$Name),c(1:4,12:19,24:ncol(annEPIC))]

DMPs <- topTable(fit2, num=Inf, coef=1, genelist=annEPICSub)

ADD REPLYlink written 3.1 years ago by Biologist70

Well, that's sort of a lot of differentially methylated probes, so I am not sure how useful it will be. What exactly does one do with hundreds of thousands of hits? You could whittle that list down to something manageable in one of two ways. You could use bumphunter to look for regions of consistent differential methylation (that's what I would do), or you could use treat instead of eBayes with a relatively large log fold change.

I also have no idea what mVals is. I assume from the name that it's the M-values, rather than the beta values. If not, you should make sure you are using the M-values.

Also note that 'DMPs' only contains the CpG sites for the first contrast.
 

ADD REPLYlink written 3.1 years ago by James W. MacDonald50k

Yes, mVals is M-Values. M-values have nicer statistical properties and are thus better for use in statistical analysis of methylation data whilst beta values are easy to interpret and are thus better for displaying data. How to whittle that list down to something manageable? Yes I will use bumphunter to detect DMR. Yes, DMPs contain CpG sites for the first contrast. I just gave that as an example. Thank you.

ADD REPLYlink modified 3.1 years ago • written 3.1 years ago by Biologist70

Hi James and Kasper,

Could anyone please tell me how can I apply bumphunter on the above data to find differential methylated regions. Thank you in Advance.

ADD REPLYlink modified 3.1 years ago • written 3.1 years ago by Biologist70
1

No, I won't do that. Even though I sometimes stray past technical help into tutorial land, that's not my intent. If you are planning to take the responsibility of analyzing your data, then take responsibility for the analysis of your data! If I just tell you what to do, then I am responsible for your analysis, not you. And I am so not down with that.

Pete Hickey and Kasper Hansen just gave a nice tutorial at BioC2016 that you could read. Or you could look at the tutorial for the bumphunter package itself, or do some Google searches.

ADD REPLYlink written 3.1 years ago by James W. MacDonald50k

You are mistaken I didn't want you to do that. I just want to know where I can find the tutorial for bumphunter, cz I didn't find it before. Thanks for posting the tutorial link here. 

ADD REPLYlink written 3.1 years ago by Biologist70
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 96 users visited in the last hour