Question: edgeR: design matrix for methylation analysis
sgld0 wrote (15 months ago):

Hi,

I am using edgeR to identify differentially methylated regions (DMRs) between groups, following the edgeR guide, and I have a question about the design matrix.

######

> sam <- rep(samples, each=2)
> meth <- factor(rep(c("Me","Un"),6), levels=c("Un","Me"))
> design <- model.matrix(~ sam + meth)
> colnames(design) <- gsub("sam","",colnames(design))
> colnames(design) <- gsub("meth","",colnames(design))
> colnames(design)[1] <- "Int"
> design <- cbind(design,
+ Me2=c(0,0,0,0,1,0,1,0,0,0,0,0),
+ Me3=c(0,0,0,0,0,0,0,0,1,0,1,0))

The 7th column “Me” represents the methylation level (or M-value) in the 40-45 µm group. The 8th column “Me2” represents the difference in methylation level between the 50-55 and the 40-45 µm groups. Finally, the last column “Me3” represents the difference in methylation level between the 60-65 and the 40-45 µm groups.

######

The content above is from the edgeR guide. My question is how to construct a contrast column like the 8th column.
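As a rough illustration of how those extra columns arise: each one is 1 only on the methylated ("Me") rows of the samples in the corresponding group, and 0 elsewhere. The sketch below uses base R only. The sample names and the assumption of two samples per group, in the order 40-45, 50-55, 60-65, are made up so that the result matches the vectors quoted above.

```r
# Hypothetical sample names and group labels (two samples per group is an assumption)
samples <- c("S1", "S2", "S3", "S4", "S5", "S6")
group   <- rep(c("40-45", "50-55", "60-65"), each = 2)

# Each sample contributes two rows: methylated (Me) and unmethylated (Un) counts
sam  <- factor(rep(samples, each = 2), levels = samples)
meth <- factor(rep(c("Me", "Un"), 6), levels = c("Un", "Me"))
grp  <- rep(group, each = 2)

design <- model.matrix(~ sam + meth)

# Me2 and Me3 are indicators for "methylated row AND sample in that group"
design <- cbind(design,
                Me2 = as.numeric(meth == "Me" & grp == "50-55"),
                Me3 = as.numeric(meth == "Me" & grp == "60-65"))
```

Written this way, the 0/1 vectors do not have to be typed by hand, and adding another group only needs one more indicator column.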

modified 13 months ago by Gordon Smyth • written 15 months ago by sgld0

Is the intercept (reference) group the 40-45 µm group?

If I want to compare the 50-55 µm group with the 60-65 µm group, how do I do that? I found an answer about design matrices in GLMs. Is that approach OK for edgeR?
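One possible way to see this comparison, given the design quoted in the question: "Me2" is (50-55) minus (40-45) and "Me3" is (60-65) minus (40-45), so their difference, Me3 - Me2, is (60-65) minus (50-55). A minimal sketch of the corresponding contrast vector in base R follows. The coefficient names, in particular the sample names S2 to S6, are hypothetical and only need to match the column order of the design matrix.

```r
# Column names of the 9-column design (sample names are hypothetical)
coef_names <- c("Int", "S2", "S3", "S4", "S5", "S6", "Me", "Me2", "Me3")

# Contrast: (60-65 vs 40-45) minus (50-55 vs 40-45) = 60-65 vs 50-55
contrast <- setNames(numeric(length(coef_names)), coef_names)
contrast["Me3"] <-  1
contrast["Me2"] <- -1

print(contrast)
```

A vector like this can be supplied through the contrast argument of edgeR's glmLRT or glmQLFTest after fitting the model with that design.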

Is the approach that edgeR or limma uses for differential analysis to fit a linear model with a coefficient for each group, and then contrast the groups' coefficients to identify differences?

Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote (15 months ago):

You are probably following the edgeR methylation workflow from F1000R:

https://f1000research.com/articles/6-2055/

Since publishing that paper, we have added a new function modelMatrixMeth to edgeR to make it much easier for you. You can now simply use

design <- modelMatrixMeth(~group)

where group is your sample grouping factor. Have a look at the revised methylation workflow, which is here:

http://www.statsci.org/smyth/pubs/edgeRMethylationPreprint.pdf
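For readers who want to try this, here is a small sketch of the call. The group labels and the two-samples-per-group layout are assumptions for illustration, and the code is guarded so that it only calls edgeR if the package is installed.

```r
# Hypothetical grouping factor: one entry per sample, two samples per group
group <- factor(rep(c("40-45", "50-55", "60-65"), each = 2))

if (requireNamespace("edgeR", quietly = TRUE)) {
  # Expands the sample-level design to one row per methylated/unmethylated count column
  design <- edgeR::modelMatrixMeth(~ group)
  print(colnames(design))
  print(dim(design))
} else {
  message("edgeR is not installed; skipping modelMatrixMeth example")
}
```

Note that group has one entry per sample here, not one per count column, which is different from the manually built design in the question.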

Gordon, what is the role of library size in the differential methylation analysis? I noticed that the F1000Research paper uses "keep.lib.sizes=FALSE" in many places, which causes the library sizes to be recomputed.

The article says:

"In the above code, the two library sizes for each sample should be equal. Otherwise, the library size values
are arbitrary and any settings would lead to the same P-value."

Hence the library sizes must not be recomputed automatically after they have been set; otherwise they would no longer be equal for the two counts from the same sample.
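To make the "equal library sizes" point concrete, here is a toy sketch in base R. The counts are made up, and using the total of the methylated and unmethylated column sums as the shared value is an assumption for illustration; the point is only that both columns of a sample receive the same value.

```r
# Toy counts: 3 loci x 4 columns (2 samples, each with Me and Un counts); values are made up
counts <- matrix(c(10, 5, 0,  20, 15, 30,  8, 2, 6,  12, 18, 4), nrow = 3,
                 dimnames = list(NULL, c("S1.Me", "S1.Un", "S2.Me", "S2.Un")))

# Column sums computed separately differ between the Me and Un columns of a sample
lib <- colSums(counts)

# Shared library size per sample: Me column sum plus Un column sum
total <- lib[c("S1.Me", "S2.Me")] + lib[c("S1.Un", "S2.Un")]

# Repeat each sample's value for both of its columns
lib.size <- rep(unname(total), each = 2)
print(lib.size)
```

In an edgeR analysis, a vector like lib.size would be assigned to the lib.size column of the DGEList's samples data frame, after any filtering step that recomputes library sizes.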

I'm very interested in the updated methylation workflow, but the link is broken. Can someone provide a working link to that document?

Thanks